Work Type: Hybrid (3 days WFH, 2 days WFO)
Location: Bangalore
Notice Period: Immediate to 15 days
Budget: As per Company Norms
Technology: IT (Machine Learning)
Responsibilities:
- Studying, transforming, and converting data science prototypes into production-ready code (a minimal sketch follows this list)
- Deploying models to production
- Training and retraining models as needed
- Analyzing the ML algorithms that could be used to solve a given problem and ranking them by their evaluation scores
- Analyzing model errors and designing strategies to overcome them
- Identifying differences in data distribution that could affect model performance in real-world situations
- Performing statistical analysis and using results to improve models
- Supervising the data acquisition process if more data is needed
- Defining data augmentation pipelines
- Defining the pre-processing or feature engineering to be done on a given dataset
- Extending and enriching existing ML frameworks and libraries
- Understanding when the findings can be applied to business decisions
- Documenting machine learning processes
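For context, the prototype-to-production workflow in the first bullets above typically looks like the minimal sketch below. It assumes a scikit-learn prototype; the artifact name model.joblib is purely illustrative.

```python
# Minimal sketch: converting a prototype model into a deployable artifact.
# Assumes a scikit-learn workflow; the file name "model.joblib" is illustrative.
from joblib import dump, load
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Train the prototype model and check it on held-out data.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
print(f"Held-out accuracy: {model.score(X_test, y_test):.3f}")

# Serialize for production; a serving process would load and predict.
dump(model, "model.joblib")
served_model = load("model.joblib")
print(served_model.predict(X_test[:5]))
```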
Basic requirements:
- 4+ years of IT experience, including at least 2 years of relevant experience converting data science prototypes and deploying models to production
- Proficiency with Python and machine learning libraries such as scikit-learn, matplotlib, seaborn, and pandas
- Knowledge of Big Data frameworks such as Hadoop, Spark, Pig, Hive, and Flume
- Experience working with ML frameworks and libraries such as TensorFlow, Keras, and OpenCV
- Strong written and verbal communication skills
- Excellent interpersonal and collaboration skills
- Expertise in visualizing and manipulating big datasets
- Familiarity with Linux
- Ability to select appropriate hardware to run an ML model within the required latency
- Robust data modelling and data architecture skills
- Advanced degree in Computer Science/Math/Statistics or a related discipline
- Advanced math and statistics skills: linear algebra, calculus, Bayesian statistics, and descriptive statistics such as mean, median, and variance (see the sketch after this list)
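As an illustration of the statistics requirement above, and of the distribution-shift responsibility listed earlier, the hedged sketch below computes basic descriptive statistics and applies a two-sample Kolmogorov-Smirnov test to flag drift between training and live data. The column name "feature", the synthetic data, and the 0.05 threshold are assumptions made for the example.

```python
# Illustrative sketch: descriptive statistics plus a simple distribution-shift
# check. The column name "feature" and the 0.05 threshold are assumptions.
import numpy as np
import pandas as pd
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train = pd.DataFrame({"feature": rng.normal(loc=0.0, scale=1.0, size=1000)})
live = pd.DataFrame({"feature": rng.normal(loc=0.3, scale=1.0, size=1000)})

# Mean, median, variance of the training feature.
print(train["feature"].agg(["mean", "median", "var"]))

# Two-sample KS test: a small p-value suggests the live distribution differs
# from training, which can degrade model performance in production.
stat, p_value = ks_2samp(train["feature"], live["feature"])
if p_value < 0.05:
    print(f"Possible distribution shift (KS={stat:.3f}, p={p_value:.4f})")
```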
Nice to have:
- Familiarity with writing Java and R code
- Exploring and visualizing data to gain an understanding of it
- Verifying data quality and ensuring it via data cleaning (a brief example follows this list)
- Finding available datasets online that could be used for training
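To ground the data-quality bullet above, here is a hedged pandas sketch of the kind of verification and cleaning involved. The file name "raw.csv" and the median-imputation choice are assumptions for the example.

```python
# Hedged sketch of data-quality verification and cleaning with pandas.
# The file name "raw.csv" and the median-imputation choice are assumptions.
import pandas as pd

df = pd.read_csv("raw.csv")

# Verify quality: report missing values and exact duplicate rows.
print(df.isna().sum())
print(f"Duplicate rows: {df.duplicated().sum()}")

# Clean: drop duplicates, impute numeric gaps with the column median.
df = df.drop_duplicates()
numeric_cols = df.select_dtypes(include="number").columns
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())
```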