AI&MLOps Platform offers cloud-optimized ML model development environments that integrate with a range of open-source software on Kubernetes.
The standardized environments support a range of machine learning frameworks, including TensorFlow, PyTorch, scikit-learn, and Keras. The entire pipeline for developing, training, and deploying machine learning models is automated, enabling simple configuration and creation as well as reuse of models.
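As a rough illustration of what such pipeline automation means, the sketch below chains development, training, and deployment stages as ordered steps. The `Pipeline` class and step names here are hypothetical stand-ins, not the platform's actual API; the platform builds comparable workflows on Kubeflow.

```python
# Minimal, framework-agnostic sketch of an automated ML pipeline.
# The Pipeline class and all step names are illustrative only.

class Pipeline:
    def __init__(self):
        self.steps = []

    def step(self, func):
        """Register a function as a pipeline stage (decorator)."""
        self.steps.append(func)
        return func

    def run(self, data):
        """Execute stages in order, passing each output to the next."""
        for func in self.steps:
            data = func(data)
        return data

pipeline = Pipeline()

@pipeline.step
def preprocess(raw):
    # Scale raw values into [0, 1].
    return [x / max(raw) for x in raw]

@pipeline.step
def train(features):
    # Stand-in for model training: the "model" is just the feature mean.
    return sum(features) / len(features)

@pipeline.step
def deploy(model):
    # Stand-in for deployment: wrap the model as a callable endpoint.
    return {"endpoint": "/predict", "model": model}

result = pipeline.run([2, 4, 8])
```

Because each stage is registered declaratively, the same pipeline definition can be re-run on new data or reused across projects, which is the practical benefit the automation provides.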
AI&MLOps Platform provides various features for configuring MLOps environments, including distributed training job execution and monitoring, inference service management and analysis, and job queue management. Users can also take advantage of job schedulers (FIFO, bin-packing, and gang-based), GPU fraction, GPU resource monitoring, and other add-on features for efficient GPU resource utilization. In particular, bare metal (BM)-based multi-node GPUs and GPUDirect RDMA (Remote Direct Memory Access) help achieve faster processing for large language model (LLM) and natural language processing (NLP) workloads.
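To illustrate how a bin-packing scheduler differs from simple FIFO placement, the sketch below places each job on the node with the least remaining GPU capacity that still fits it, keeping larger nodes free for larger jobs. This is a simplified model under assumed node and job sizes, not the platform's actual scheduler implementation.

```python
# Simplified bin-packing GPU scheduler: place each job on the node with
# the LEAST free GPUs that can still fit it ("tightest fit"), so large
# nodes stay available for large jobs. Illustrative only.

def bin_pack(jobs, nodes):
    """jobs: list of (job_name, gpus_needed).
    nodes: dict of node_name -> free GPU count (mutated as jobs land).
    Returns {job_name: node_name}; unplaced jobs map to None."""
    placement = {}
    for name, need in jobs:
        # Candidate nodes that can fit the job, tightest fit first.
        fits = sorted((free, node) for node, free in nodes.items() if free >= need)
        if fits:
            free, node = fits[0]
            nodes[node] = free - need
            placement[name] = node
        else:
            placement[name] = None  # would wait in the job queue
    return placement

nodes = {"node-a": 8, "node-b": 4}
jobs = [("train-small", 2), ("train-large", 8)]
placement = bin_pack(jobs, nodes)
```

Here the 2-GPU job lands on the 4-GPU node, leaving the 8-GPU node intact for the large job; a naive first-come placement onto the largest node would have left the 8-GPU job stuck in the queue.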
- Create AI platforms (automatic deployment/configuration), view them (platform version, resource status), and delete them
- Provide Jupyter Notebook: Model development, training, inference
- Automate machine learning pipeline workflows
- Provide other default open-source Kubeflow features
- Advanced AI/ML platform dashboard
- AI/ML notebook server: Base images, user-defined images
- AI/ML jobs: Job creation, templates, archiving, scheduling, execution, monitoring
※ Supports GPU resource monitoring and GPU fraction
- Build and manage user images
- AI JumpStarter and ETM (Experiment Tracking Management)
- Serving: Dashboard, model registration/management, inference, predictions, and visualization
- Manage platform resources: Manage resource usage by project, monitor resource usage
- Manage project users/permissions, admin features, and platform configuration
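For the serving features above, a registered model is typically invoked over HTTP with a JSON request body. The sketch below builds such a body in the style of the open inference protocol commonly used by Kubeflow-based serving stacks (e.g. KServe v2); the input name, shape, and schema are assumptions for illustration, not the platform's documented request format.

```python
import json

# Build an inference request body in a KServe-v2-like shape.
# "input-0" and the FP32 datatype are hypothetical examples.

def build_request(inputs):
    """inputs: 2-D list of floats (batch of feature rows)."""
    return {
        "inputs": [
            {
                "name": "input-0",
                "shape": [len(inputs), len(inputs[0])],
                "datatype": "FP32",
                "data": inputs,
            }
        ]
    }

body = json.dumps(build_request([[0.1, 0.2], [0.3, 0.4]]))
```

The serialized `body` would then be POSTed to the model's inference endpoint, and the prediction response visualized through the serving dashboard.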