Zensors is looking for an experienced machine learning engineer to join our team and help build and maintain our large multimodal model infrastructure. In this role, you will design and implement scalable, reliable systems that handle large volumes of data and support complex machine learning algorithms. You should have a strong background in distributed computing, cloud-based technologies, and software engineering principles.
Responsibilities:
1. Design and implement scalable, fault-tolerant machine learning infrastructure for training and developing large multimodal models and for running big data pipelines.
2. Work closely with data scientists to identify and optimize performance bottlenecks in their models and algorithms.
3. Develop tools and automation scripts that help streamline the deployment process for new models and experiments.
4. Ensure high availability and security of all systems by implementing monitoring, logging, and alerting mechanisms.
5. Maintain documentation of all infrastructure components, including architecture diagrams, configuration files, and runbooks.
6. Collaborate with cross-functional teams to drive engineering initiatives such as code reviews, continuous integration/delivery (CI/CD), and testing frameworks.
7. Maintain documentation of all model components, including architecture diagrams, configuration files, and runbooks.
8. Stay up-to-date with the latest machine learning research and techniques to continuously improve our applications.
Qualifications:
1. Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
2. 3+ years of experience with distributed systems and cloud-based technologies, and in applying software engineering principles.
3. Strong programming skills in Python, Java, C++, or similar languages.
4. Familiarity with machine learning frameworks such as TensorFlow, PyTorch, or Keras.
5. Experience with containerization and orchestration technologies such as Docker or Kubernetes.
6. Excellent communication and collaboration skills.
7. Self-motivated and proactive approach to problem-solving.