Etrance Networks offers technology consulting and engineering services in Embedded, Networking, Telecom and Security products. We have expertise in the design and development of communication products such as Routers, LTE Packet Gateway and Signaling, Deep Packet Inspection, BRAS, Video CDN, and HTTP/SSL Load-Balancing and Offload Appliances. Etrance was founded by a team with broad experience at MNCs like Cisco, HP and Motorola, and at technology startups like Movik and Sonoa.
We have four offerings for our clients:
* Talent Development - Training with mini-projects, customized workshops
* Consulting - Product Ideation, architecture, evaluation, prototyping and System Engineering
* Services - Design and Development of Control-Plane, Forwarding-chipset, Management and Platform features
* Product Engineering - Turnkey model for engineering deliverables, including architecture, development, QA and documentation.
Ki-Tech Solutions - IEEE project development. We offer IEEE projects, MCA final-year student projects, engineering projects and training, PHP projects, Java and J2EE projects, ASP.NET projects, NS2 projects, MATLAB projects and IPT training in Rajapalayam, Virudhunagar district, and Tamil Nadu. Mail to: kitechsolutions.in@gmail.com
The project aims to develop a crime file system that maintains a computerized record of all FIRs (First Information Reports) registered against crimes. The system is a desktop application that can be accessed throughout the police department, serving as the department's crime file for managing records of activities related to First Information Reports. The desktop crime file system manages activities such as registering complaints, updating information, and searching for and viewing particular crime reports, which saves time and manpower. The software provides a police station with facilities for recording crimes, complaints, FIRs, charge sheets and prisoner records, and for displaying details of most-wanted criminals.
This system will give the organization better prospects for improved quality and transparency.
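The record-keeping operations described above (registering a complaint, updating it, and searching reports) can be sketched as a minimal in-memory store; the class and field names here are illustrative assumptions, not details from the actual system:

```python
# Minimal sketch of the crime-file operations: register, update, search.
# Field names and statuses are assumed for illustration only.

class CrimeFile:
    def __init__(self):
        self._firs = {}      # FIR id -> record
        self._next_id = 1

    def register(self, complainant, description):
        """Register a new complaint/FIR and return its id."""
        fir_id = self._next_id
        self._next_id += 1
        self._firs[fir_id] = {"complainant": complainant,
                              "description": description,
                              "status": "registered"}
        return fir_id

    def update(self, fir_id, **fields):
        """Update information on an existing FIR."""
        self._firs[fir_id].update(fields)

    def search(self, keyword):
        """Return ids of FIRs whose description mentions the keyword."""
        return [fid for fid, r in self._firs.items()
                if keyword.lower() in r["description"].lower()]
```

A desktop application would back this with a database rather than a dict, but the operations are the same.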
As a software product engineering specialist, Droisys understands the significance of hiring people with the right talent and passion for technology. Our hiring methods have been designed keeping this in mind.
For Software Engineers/ Technical Positions
Arocom is a consulting and solution engineering company with expertise in providing engineering services for AI & Machine Learning, Data Operations & Analytics, MLOps and Cloud Computing.
Our clients include companies in biotech, drug discovery, therapeutics, manufacturing and retail, as well as startups. Our consultants are highly skilled and offer hands-on talent to help our clients achieve their goals.
Profile Summary
14 years of total experience in Python development
10 years leading teams, as Scrum Master and in management
8 years of experience as a Solution Architect across multiple projects
Open-source contributor to the Python Software Foundation
Research & development, proofs of concept, SDLC process
Gathering requirements directly from clients, and reporting
SME in Agile methodology and cloud technology
Corporate trainer for Python, Flask and Agile
Conducting interviews for Python, Linux and C++
Domain exposure: Banking, Finance, Digital, Network Security, Energy, CFD, HPSA, Server Automation
My name is Senthil Kumar. I hold a Master of Computer Application degree and have 9+ years of total working experience, with good knowledge of the automation test life cycle, including the design, coding, testing, documentation and implementation of automation frameworks. I have expertise in designing hybrid frameworks for web, Windows, mainframe and Siebel applications, and have done automation test estimation for the same. I am confident that my presentation skills and high-quality, cost-effective methodologies can secure large automation projects for the company. Having worked at a global company, I am well aware of the work culture of MNCs. My command of QTP/UFT, QC/ALM, VB Scripting, VBA, SQL and languages like C and C++ makes me a qualified Automation Test Engineer.
Adjusting primitives for graph : SHORT REPORT / NOTES - Subhajit Sahu
Graph algorithms, like PageRank, commonly operate on Compressed Sparse Row (CSR), an adjacency-list based graph representation.
Multiply with different modes (map)
1. Performance of sequential vs OpenMP-based vector multiply.
2. Comparing various launch configs for CUDA-based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential vs OpenMP-based vector element sum.
2. Performance of memcpy vs in-place CUDA-based vector element sum.
3. Comparing various launch configs for CUDA-based vector element sum (memcpy).
4. Comparing various launch configs for CUDA-based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA-based vector element sum (in-place).
Techniques to optimize the PageRank algorithm usually fall into two categories: reducing the work per iteration, and reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged can save iteration time. Skipping in-identical vertices (those with the same in-links) reduces duplicate computation, and thus can also reduce iteration time. Road networks often have chains which can be short-circuited before PageRank computation to improve performance, since the final ranks of chain nodes are easily calculated; this can reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order, which can reduce the iteration time and the number of iterations, and also enables multi-iteration concurrency in the computation. The combination of all of the above methods is the STICD algorithm [sticd]. For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
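As an illustration of the first category, here is a minimal pure-Python sketch (not the STICD implementation) of power-iteration PageRank that skips vertices whose ranks have already converged. The graph, damping factor and tolerance are assumed example values, and the sketch assumes no dangling nodes:

```python
# Power-iteration PageRank with converged-vertex skipping.
# Assumes no dangling nodes (every vertex has at least one out-link).
# A full implementation would re-activate a vertex whose in-neighbours
# change after it was marked converged; this sketch does not.

def pagerank_skip_converged(out_links, damping=0.85, tol=1e-10, max_iter=100):
    verts = list(out_links)
    n = len(verts)
    rank = {v: 1.0 / n for v in verts}
    converged = set()
    # Build reverse adjacency: who points to v.
    in_links = {v: [] for v in verts}
    for u, outs in out_links.items():
        for v in outs:
            in_links[v].append(u)
    for _ in range(max_iter):
        new_rank = dict(rank)
        for v in verts:
            if v in converged:
                continue  # skip work: this vertex's rank is already stable
            r = (1 - damping) / n + damping * sum(
                rank[u] / len(out_links[u]) for u in in_links[v])
            if abs(r - rank[v]) < tol:
                converged.add(v)
            new_rank[v] = r
        rank = new_rank
        if len(converged) == n:
            break
    return rank
```

On a directed 3-cycle every vertex converges to 1/3, so the loop exits after very few iterations.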
Opendatabay - Open Data Marketplace.pptx - Opendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with dedicated, AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits, Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay: the marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ... - Subhajit Sahu
Abstract: Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation of ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It comes, however, with the precondition that the input graph has no dead ends. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by a large submission of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
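The levelwise idea of processing strongly connected components one topological level at a time can be sketched as follows. The block-graph below is a made-up example, and the sketch assumes the SCC decomposition into a DAG has already been computed:

```python
# Group the nodes of a DAG (here: the block-graph of SCCs) into levels
# using Kahn's algorithm: every node's predecessors lie in strictly
# earlier levels, so each level can be processed independently.

from collections import deque  # deque would back a streaming variant

def topological_levels(block_graph):
    """block_graph: dict mapping component id -> list of successor ids."""
    indeg = {v: 0 for v in block_graph}
    for outs in block_graph.values():
        for v in outs:
            indeg[v] += 1
    level = [v for v in block_graph if indeg[v] == 0]
    levels = []
    while level:
        levels.append(sorted(level))
        nxt = []
        for u in level:
            for v in block_graph[u]:
                indeg[v] -= 1
                if indeg[v] == 0:
                    nxt.append(v)
        level = nxt
    return levels
```

Each returned level is a set of components whose ranks depend only on already-processed levels, which is what makes per-iteration communication unnecessary.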
Shiv Shankar Dutta
Phone – +91-9136819634
Email – shivdutta@protonmail.com
GIT: https://github.com/Shivdutta
Medium: https://medium.com/@shivdutta
LinkedIn: https://www.linkedin.com/in/75ssd/
Experience Summary
Experienced Data Scientist/Data Solution Architect in designing and developing machine learning and
deep learning solutions in line with quality and regulatory standards.
Hands on experience in the areas of Classification Solutions, Regression methods, Decision Trees,
Neural Networks, NLP, Chatbots, Image and Video Analytics.
Data Science professional with experience in all stages of data processing and insights delivery.
Experience working in Start-up, mid and large organization for project/product development, services
and delivery.
Worked with various clients such as HDFC, Shell, the Indian Government, AIG and Allstate.
Experience in conducting analytics assessment workshop and client requirement gathering.
Gather, evaluate & document business requirements related to analytics, translate to analytics solution
definition & ability to implement using Python
Deep involvement in handling critical deliverables, benchmarking solutions, driving key
metrics and best practices, maintaining productivity and ensuring projects are profitable.
Successful liaison between business users and technical developers, working in an onsite-offshore model for multiple deliveries. Planning and prioritizing the work product to meet timelines.
Defining analytics product road map in line with Enterprise Architecture Framework.
Design and Development of Machine Learning Models, Deep Learning Models, Data/Text Mining,
Image Analysis Classification and NLP from start to end for POC and Analytics Solutions in any Domain
or any Business Use Case.
Hands-on experience in handling unstructured data such as images, video and text.
Hands-on experience in data collection, feature selection and feature engineering from multiple data sources such as NoSQL DBs (Hive, HBase, MongoDB), flat files and SQL DBs.
Hands-on experience in building predictive models in real-time, near-real-time and batch modes using machine learning.
Exposure to end-to-end data pipeline creation in enterprise systems using different tools: Flume, Kafka, Spark, Flask, Docker.
Model validation with testing and client stakeholders; aligning with business and technology for deployment, validation and acceptance.
Manage and mentor team.
TOGAF 9.1 Certified. Experience in presales activities like technical solution and estimation, POC,
customer demo.
Well versed with Agile/Waterfall methodologies, CMMI Level 5 process, Estimation techniques,
Requirement Gathering and Elicitation, Design using UML techniques.
Hands-on expertise in data governance, data lineage, data processes, DML and data architecture control execution, Master Data Management (MDM) and Data Governance (DG).
Exposure to deployment of ML models in Kubernetes and Docker clusters, on-premise or in hybrid environments.
Exposure to interfacing IoT/connected devices/sensors for data analysis and edge analytics.
Model quantization for handheld devices using TensorFlow Lite.
Primary Skills
Machine Learning
Regression, K-Means Clustering, Decision Tree, SVM, Bayes
Theorem, Naive Bayes, Random Forest, LGBM, XGBoost, Apriori,
Time Series (ARIMA)
Neural Network
CNN, RNN, AutoEncoders, Keras, PyTorch, TensorFlow, YOLO,
OpenCV, SSD, BERT, ResNet, Inception, RCNN, SSD-MobileNet
NLP
NLTK, spaCy, POS tagging, Tokenization, Stemming and Lemmatization,
BERT, Word2vec, GloVe, Embeddings
RDBMS SQL Server, MySQL, PostgreSQL
Statistical Techniques Hypothesis, ANOVA, Chi-Square
No SQL DB MongoDB
Languages Python, R, C#
Enterprise Architecture TOGAF Certified
Chat bots Lex, LUIS, RASA, Dialog flow
Messaging Kafka
Tools PyCaret, AutoML,Jupyter Notebook,Spyder,Anaconda
Methodology Agile & Waterfall
Secondary Skills
Infrastructure Management:
Hortonworks Ambari, Cloudera Hue, SageMaker, Cloud ML, Docker
and Kubernetes
Transformation: Sqoop, Apache Spark, PySpark, Flume
Big Data Hadoop, HDFS, Yarn, Zookeeper
Others Hive, Pig, HBase,Scala
Cognitive Services: Cloud/Open source Cognitive Services
Employment History
Tenure                Company               Position
Apr 2017 to Jun 2019  Sequretek Pvt. Ltd.   Technology Architect / Data Scientist
Nov 2015 to Mar 2017  CGI India Ltd.        Solution/Technical Architect, ML (Associate Consultant)
Oct 2008 to Nov 2015  Rolta India Ltd.      Solution/Technical Architect / Project Manager / Presales Architect / BI / Machine Learning (Senior Manager)
Jul 2004 to Oct 2008  Syntel                Project Lead
Jun 2003 to Jun 2004  L&T Infotech Ltd      Software Engineer
Project: Hand Detection System for Shredder Machine
Technology/Deep Learning: RCNN, Python, Google Coral
Overview: The project was undertaken to prevent industrial accidents to workers near the shredding machine. A shredding machine, or document shredder, is a mechanical device that cuts paper and other information-bearing media into fragments so small that the information can no longer be retrieved. Feeding paper into the machine is a manual activity, during which accidents sometimes occur. To stop the machine automatically, the system should detect a hand beyond the limit line and alert the user on crossing it. The alert takes the form of a buzzer, a red light and stopping of the machine; a GPIO pin of the Coral board is connected to the machine to stop it. Images of hands were labeled and used to train an RCNN.
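The decision step described above (detect a hand beyond the limit line, then raise the alert) might be sketched as follows; coordinates, thresholds and action names are illustrative assumptions, and the real GPIO side effects are omitted:

```python
# Given a hand bounding box from the RCNN detector, decide whether it
# crosses the limit line near the shredder intake (image y grows
# downward), and map that decision to the alert actions described above.
# The action names are placeholders for real buzzer/light/GPIO calls.

def hand_beyond_limit(box, limit_y):
    """box = (x1, y1, x2, y2) in image coordinates.
    True when any part of the hand passes the limit line."""
    _, _, _, y2 = box
    return y2 >= limit_y

def alert_actions(box, limit_y):
    """Return the list of actions to trigger for one detection."""
    if hand_beyond_limit(box, limit_y):
        return ["buzzer", "red_light", "stop_machine"]
    return []
```

In the deployed system each action would drive a GPIO pin on the Coral board rather than return a string.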
Role & Responsibilities:
Part of Product Architecture Team.
Model Development and Validation using RCNN
Model Deployment using RCNN, Validation and acceptance.
Development of code for hand detection.
Outcome and Contributions:
o Achieved accuracy of more than 95%.
o Through automation, accidents were completely avoided.
Team Size: 2
Client: Freelance
Project: Manage Detection and Response
Machine Learning: Light GBM, SVM, KNN, Clustering, Neural Networks (RNN)
Technology: Python, Apache Spark, PyTorch, Apache Kafka, Flume, MongoDB, Jenkins, Git, Hive, HBase, Oozie
Overview: Managed Detection and Response is a combination of technology and skills to deliver advanced threat detection, deep threat analytics, global threat intelligence, faster incident mitigation, and collaborative breach response on a 24x7 basis. The endpoints (IoT devices and enterprise servers) have system scanning done by an EMS system or scan component. Apache Flume agents capture the logs and send them to a topic in Apache Kafka. Files received are server logs, client profiles, schema profiles and network settings. Master Data Management was set up for the data profile and schema as part of the data sharing agreement with the client. In data ingestion, the queue is consumed using the Apache Spark batch and stream components. Data quality services were set up as part of data validation. Multiple layers were set up in the Spark component for data validation and cleansing, standardization and transformation. The data is reduced and stored in HBase for machine learning; Hive was set up for data archiving and Oozie was used for scheduling.
Machine learning techniques such as ensemble learning and boosting (SVM, Light GBM, Random Forest), K-means clustering, KNN and RNN are applied so that the best possible result is derived.
Nov 2002 to Jun 2003  Amtech Communication  Senior Application Developer
Sep 2001 to Oct 2002  Nazara.com            Senior Programmer
Sep 1999 to Sep 2001  Ideaz Netechnologies  Programmer
Recognitions
Manager Award in L&T Infotech
CMMI Level 5 participation in Rolta
Qualifications
Bachelor of Engineering in Electronics and Telecommunication, Marathawada University. First Class.
Certifications
Coursera: 2019-03 Neural Networks and Deep Learning
Open Group: 2018-02 TOGAF 9.1 Certified
Udemy: 2019-07 Deep Learning with TensorFlow 2.0 [2019]
Udemy: 2017-02 Big Data with Spark Streaming and Pyspark
The data is processed using machine learning for threat detection. The output is stored in MongoDB and displayed in a dashboard. Exception handling, audit trails and service accounts were created as part of the framework.
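One simple way to combine the outputs of several classifiers (SVM, Light GBM, RNN, and so on), as the pipeline above does, is majority voting. This is a hedged sketch with stand-in models, not the production ensemble:

```python
# Majority-vote ensemble: merge the labels emitted by individual models
# for one sample. Ties resolve to the label seen first, since Counter
# preserves insertion order and most_common is a stable sort by count.

from collections import Counter

def majority_vote(predictions):
    """predictions: list of labels from the individual models."""
    return Counter(predictions).most_common(1)[0][0]

def ensemble_predict(models, sample):
    """models: list of callables, each mapping a sample to a label."""
    return majority_vote([m(sample) for m in models])
```

In the real system the callables would wrap trained SVM/LightGBM/RNN models; here any callables with the same interface work.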
Role & Responsibilities:
Part of Product Architecture Team.
Feature Selection and Engineering for Web Attack, Network Attack, Malware Attack.
Model Development and Validation using Light GBM and SVM, Neural Network like RNN
Model Deployment and Validation and acceptance.
Collaborating with cross-functional teams in an agile setup
Development of data ingestion and log processing components using Apache Spark/Flume, Kafka, HDFS and
MongoDB
Outcome and Contributions:
o Complexity in performing feature engineering is avoided.
o High accuracy of more than 95%, compared to around 60% with manual classification.
o By automation, 4 person-months of manual effort per customer are saved.
o Approximately the cost of 5 to 10 HP ArcSight licenses per customer is saved.
Team Size 15
Client: HDFC, DMart, Product Development
Project: EDPR Machine Learning
Machine Learning: Light GBM, SVM, KNN, Clustering, Random Forest
Technology: Python, Apache Spark, PyTorch, Apache Kafka, Flume, SQLite, Jenkins, Git, Hive, HBase, Oozie
Overview: Endpoint Detection and Protection Response detects, protects against and responds to cyberattacks, which add to the complexity of securing the enterprise. Each point product adds an agent to the endpoint and is often managed independently of the other security technologies present on that endpoint.
Files received are server logs, client profiles, schema profiles and network settings. Master Data Management was set up for the data profile and schema as part of the data sharing agreement with the client. In data ingestion, the queue is consumed using the Apache Spark batch and stream components. Data quality services were set up as part of data validation. Multiple layers were set up in the Spark component for data validation and cleansing, standardization and transformation. This involves feature extraction and feature engineering for malware based on static analysis of PE and PDF file types; the metadata is extracted from malware samples. The data is reduced and stored in HBase for machine learning; Hive was set up for data archiving and Oozie was used for scheduling.
Machine Learning: Thereafter, data pre-processing and data cleaning are done. Based on exploratory analysis, the model is regularly created/updated and validated. Machine learning techniques such as ensemble learning and boosting (SVM, Light GBM, Random Forest) and TensorFlow Lite are applied so that the best possible result is derived.
Role & Responsibilities:
Part of Product Architecture Team.
Model Creation, Data Pre-Processing, Data Cleaning.
Feature Selection and Engineering.
Model Development and Validation using Light GBM and TensorFlow Lite
Model Deployment and Validation and acceptance.
Outcome and Contributions:
o Complexity in performing feature engineering is avoided.
o Improved accuracy metrics by 4% over the earlier traditional model.
o By automation, 10 person-months of manual effort are saved.
Team Size 10
Client: Bharat Co-operative Bank, Product Development
Project: Website Security Checking
Machine Learning: Light GBM, SVM, KNN, Clustering, Random Forest
Technology: Python, Apache Spark, PyTorch, Apache Kafka, Flume, MongoDB, Jenkins, Git, HBase, Oozie
Overview: The product was developed to check website sanity and to plug into other products. It was used to validate URLs and to create a Threat Intelligence database. Features such as URL length, domain registration, port, HTTPS, DNS record age and Google page rank are used in detection. A separate environment is created for data creation, and features are extracted from the captured data.
In data ingestion, the queue is consumed using the Apache Spark batch component. Data quality services were set up as part of data validation. Multiple layers were set up in the Spark component for data validation and cleansing, standardization and transformation. The data is reduced and stored in MongoDB for machine learning; Hive was set up for data archiving and Oozie was used for scheduling.
Some samples were collected from third parties and some from the server logs of managed services. This is a continuous process. EDA is performed for data standardization. The model is trained incrementally every week, thereby creating a rich threat intelligence repository.
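A few of the URL features mentioned above (length, explicit port, HTTPS, subdomain count) can be extracted with the standard library alone. The feature set here is an illustrative subset, since the real product also used signals like domain registration and DNS record age:

```python
# Extract simple lexical features from a URL for a phishing/malicious-URL
# classifier. Only stdlib urllib.parse is needed; richer features
# (WHOIS age, DNS records, page rank) require network lookups.

from urllib.parse import urlparse

def url_features(url):
    p = urlparse(url)
    return {
        "length": len(url),                         # long URLs are a weak phishing signal
        "uses_https": p.scheme == "https",
        "explicit_port": p.port is not None,        # e.g. :8080 in the netloc
        "num_subdomains": max(p.hostname.count(".") - 1, 0) if p.hostname else 0,
    }
```

Each feature dictionary would become one row of the training data consumed by the Spark pipeline.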
Role & Responsibilities:
Part of Product Architecture Team.
Model Creation, Data Pre-Processing, Data Cleaning.
Feature Selection and Engineering.
Model Development and Validation using Clustering, Boosting and Random Forest.
Model Deployment and Validation and acceptance.
Client: Product Development(Internal)
Team Size 15
Project: SOC Incident Log Ticket Allocation:
Machine Learning: Light GBM, SVM, KNN, Clustering, Random Forest
Technology: Python, Apache Spark, PyTorch, Apache Kafka, Flume, MongoDB, Jenkins, Git, Hive, HBase, Oozie
Overview: Managed services handle many security-related measures for global customers, and the SOC (Security Operations Centre) is established to achieve this objective. Ticket resolution time is important for customers and for capacity planning. Initially, all historical data available in Excel sheets is used as input for training; EDA, feature selection, feature engineering and PCA are done prior to training.
The queue is consumed using the Apache Spark batch component. Data quality services were set up as part of data validation. Multiple layers were set up in the Spark component for data validation and cleansing, standardization and transformation. The data is reduced and stored in MongoDB for machine learning; Hive was set up for data archiving and Oozie was used for scheduling.
After training, whenever a new ticket arrives its ETA is calculated, and on closing, the variance is also captured. Based on this analysis the model is retrained.
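The ETA-and-variance loop described above might be sketched as follows, estimating a new ticket's ETA from the mean historical resolution time of its category; category names and hours are made-up examples:

```python
# Estimate a new ticket's ETA as the mean historical resolution time for
# its category, and capture the variance between the predicted ETA and
# the actual time at closing. A trained model would replace the simple
# per-category mean; the loop structure is the same.

from statistics import mean

def build_eta_table(history):
    """history: list of (category, resolution_hours) from closed tickets."""
    by_cat = {}
    for cat, hours in history:
        by_cat.setdefault(cat, []).append(hours)
    return {cat: mean(v) for cat, v in by_cat.items()}

def eta_for(table, category, default=24.0):
    """ETA for a new ticket; fall back to a default for unseen categories."""
    return table.get(category, default)

def closing_variance(predicted_eta, actual_hours):
    """Captured when the ticket closes; feeds retraining analysis."""
    return actual_hours - predicted_eta
```

The captured variances would be aggregated to decide when the model needs retraining.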
Role & Responsibilities:
Part of Product Architecture Team.
Model Creation, Data Pre-Processing, Data Cleaning.
Feature Selection and Engineering.
Model Development and Validation using Clustering, Boosting and Random Forest.
Model Deployment and Validation and acceptance.
Client: HDFC, AEGON, Reliance, Product Development(Internal)
Team Size 10
Project: Malicious PDF file detection:
Machine Learning: Light GBM, SVM, KNN, Clustering, Random Forest
Technology: Python, Apache Spark, PyTorch, Apache Kafka, Flume, SQLite, Jenkins, Git, Hive, HBase, Oozie
Overview: A PDF file can be a source of malware or of hyperlinks to phishing sites. The product was developed to check document sanity and to plug into other products. PDF files have a number of attributes, semi-structures and snippets where an exploit can easily be injected; features such as JavaScript, rich media, JBIG2Decode and OpenAction are vulnerable to attack.
The queue is consumed using the Apache Spark batch component. Data quality services were set up as part of data validation. Multiple layers were set up in the Spark component for data validation and cleansing, standardization and transformation. The data is reduced and stored in MongoDB for machine learning; Hive was set up for data archiving and Oozie was used for scheduling.
Some samples were collected from third parties and from the internal threat repository. This is a continuous process. EDA is performed for data standardization. The model is trained incrementally every week, thereby creating a rich threat intelligence repository.
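The static-analysis features mentioned above can be illustrated by counting risky PDF keywords in the raw bytes. The keyword list is an assumed subset, and real tooling also handles obfuscated names, which this sketch does not:

```python
# Count occurrences of risky PDF keywords in raw bytes as static-analysis
# features for a malware classifier. Obfuscated names (e.g. hex-escaped
# /OpenAction) would be missed; dedicated tools normalize those first.

KEYWORDS = [b"/JS", b"/JavaScript", b"/OpenAction", b"/JBIG2Decode", b"/RichMedia"]

def pdf_keyword_features(data: bytes):
    """Map each keyword to its occurrence count in the file bytes."""
    return {kw.decode(): data.count(kw) for kw in KEYWORDS}
```

The resulting counts would join the other metadata extracted from each sample before model training.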
Role & Responsibilities:
Part of Product Architecture Team.
Model Creation, Data Pre-Processing, Data Cleaning.
Feature Selection and Engineering.
Model Development and Validation using Clustering, Boosting and Random Forest.
Model Deployment and Validation and acceptance.
Client: HDFC, AEGON, Reliance, Product Development(Internal)
Team Size 10
Project: IGA
Machine Learning: Light GBM, SVM, KNN, Clustering, Random Forest
Technology: Python, Apache Spark, PyTorch, Apache Kafka, Flume, MongoDB, Jenkins, Git, HBase, Oozie
Overview: IGA is an integrated access management and governance product which takes care of the entire life cycle of employee engagement (onboarding and exit). During onboarding, the employee ID is created and access to different systems is granted after approval; during exit, all access and IDs are revoked. The data is collected from multiple systems such as the attendance system, leave portal, access management, training system, appraisal system and other client systems. Some of the attributes are employee salary, employee satisfaction, promotion last year, hourly rate, etc.
The queue is consumed using the Apache Spark batch component. Data quality services were set up as part of data validation. Multiple layers were set up in the Spark component for data validation and cleansing, standardization and transformation. The data is reduced and stored in MongoDB for machine learning; Hive was set up for data archiving and Oozie was used for scheduling.
The collected data is cleansed, parsed and validated; thereafter, feature selection and engineering and exploratory data analysis are applied to derive multiple metrics. A powerful dashboard is created using Tableau. Machine learning techniques such as ensemble learning and boosting (Support Vector Machine, XGBoost) and PCA are applied so that the best possible result is derived.
Role & Responsibilities:
Part of Product Architecture Team.
Model Creation, Data Pre-Processing, Data Cleaning.
Feature Selection and Engineering.
Model Development and Validation using Clustering, SVM, Boosting and Random Forest.
Model Deployment and Validation and acceptance.
Team Size 15
Client: HDFC, AEGON, Axis Bank, Product Development(Internal)
Project: Bank Customer Churn
Machine Learning: Light GBM, SVM, KNN, Clustering, Random Forest
Technology: Python, Apache Spark, PyTorch, Apache Kafka, Flume, MongoDB, Jenkins, Git, Hive, HBase, Oozie
Overview: The model is developed for the agricultural segment. There are multiple co-operative banks and financial agencies that support farmers, and the churn-based product was developed to detect customers shifting between them. The data is sourced from FTP and processed batch-wise; FTP was later replaced with Kafka-Spark streams. Attributes such as gender, geography, loan, products, subsidy and income are considered for model building. This is a continuous process: EDA is performed for data standardization, and the model is retrained incrementally every week.
Role & Responsibilities:
Part of Product Architecture Team.
Model Creation, Data Pre-Processing, Data Cleaning.
Feature Selection and Engineering.
Model Development and Validation using Clustering, SVM, Boosting and Random Forest.
Model Deployment and Validation and acceptance.
Client: Bharat Cooperative Bank
Team Size 15
Project: Insurance Claim Prediction
Machine Learning: Light GBM, SVM, KNN, Clustering, Random Forest
Technology: Python, Apache Spark, PyTorch, Apache Kafka, Flume, MongoDB, Jenkins, Git, Hive, HBase, Oozie
Overview: The model is developed for special intervention of manual checking of claims, based on historical claim processing data as well as customer-specific attributes such as policy deductibles, exclusions, umbrella limit, collision details and vehicle details. Initially FTP was used as the data source, later replaced with Kafka-Spark streams. This is a continuous process: EDA is performed for data standardization, and the model is retrained incrementally every week.
Role & Responsibilities:
Part of Product Architecture Team.
Model Creation, Data Pre-Processing, Data Cleaning.
Feature Selection and Engineering.
Model Development and Validation using Clustering, SVM, Boosting and Random Forest.
Model Deployment and Validation and acceptance.
Client: HDFC ERGO
Team Size 15
Oil Card Fraud Detection (POC)
Overview: The model was developed as part of a POC for an oil and gas company. In Europe, an oil card allows drivers of large transport companies to buy fuel and pay tolls, and in some cases to buy limited food on tour. The model detects fraud in transactions based on time zone, transaction instances, and latitude and longitude; an anomaly detection technique is used for this. It was developed on the Azure ML platform, with CSV as the data source. EDA is performed for data standardization, and the model is retrained incrementally every week.
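The anomaly-detection idea could be sketched as a simple z-score test on a transaction feature against the card's history; the threshold and feature values are illustrative assumptions, not the Azure ML model:

```python
# Flag a transaction as anomalous when a numeric feature deviates from
# the card's history by more than k standard deviations. Real pipelines
# combine several features (time zone, location, frequency); one feature
# suffices to show the mechanism.

from statistics import mean, pstdev

def is_anomalous(history, value, k=3.0):
    """history: past values of one feature (e.g. litres bought per stop)."""
    if len(history) < 2:
        return False  # not enough data to judge
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:
        return value != mu  # any deviation from a constant history stands out
    return abs(value - mu) > k * sigma
```

The threshold k trades false alarms against missed fraud and would be tuned on labeled historical transactions.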
Role & Responsibilities:
Part of Product Architecture Team.
Model Creation, Data Pre-Processing, Data Cleaning.
Feature Selection and Engineering.
Model Development and Validation using Clustering, SVM, Boosting and Random Forest.
Model Deployment and Validation and acceptance.
Client: Oil and Gas
Team Size 5
BI Reports
Overview: This project involved migration from a legacy platform to SSRS. There were nearly 200 reports; a utility was developed to migrate sample reports to the SSRS platform.
Client: Intel Bangalore
Team Size 30
BI Reports
Overview: This project involves migration from legacy platform to SSRS. There were nearly 40 reports.
Client: ACE Surety, US
Team Size 20
R & D Projects
Document classification using NLP
Asset management using YOLO object detection
Chatbot for coronavirus using Lex and Dialogflow, integrated with Facebook and Telegram
Mask-wearing prediction for social distancing using RCNN