Resume

NAME : Gunjan Kumar Gupta
CURRENT OCCUPATION : Machine Learning Scientist, Seattle, WA
INDUSTRY WORK EXPERIENCE : 10 years (Data Mining + Software Engineering)

WORK EXPERIENCE :
• Nov 2006 – Present, May – Aug 2005, Machine Learning Scientist
• Jan 2004 – Aug 2006, Research Assistant, NSF/UT
• May 2002 – Jan 2004, Standard & Poor’s, New York.
• June 2000 – May 2002, i2 Technologies, Austin, TX.
• May '99 – May 2000, Net Perceptions, Austin, TX.
• Aug '98 – May '99, Teaching Assistant, UT-Austin.
• Sept. '97 - June '98, MCI, Colorado Springs.
• April '97 to July '97, Samsung Electronics, Seoul.
• June '95 to March '97, Infosys Technologies Ltd., India.

EDUCATIONAL BACKGROUND :
• PhD in Data Mining/Comp. Eng., UT Austin, 2006
• MS in Data Mining/Comp. Eng., UT Austin, 2000.
• BS in Computer Science from IT-BHU (part of IIT-JEE), India, 1995.

SYSTEMS WORKED ON : HP, Sun, DEC Unix, DEC-VMS, AIX, NCR
COMPUTER LANGUAGES KNOWN : Java, C++, C, Perl, Pascal, HTML (CGI), SQL Plus, MySQL, MATLAB, SAS
LANGUAGES FAMILIAR WITH : LISP, Fortran, Smalltalk, PROLOG, 8086 Assembly
OPERATING SYSTEMS : Linux, Windows, HP-UX, Solaris, DEC-Unix, VMS
DATABASES WORKED ON : Oracle, MySQL, DB2, Orchestrate, SAS
VISA STATUS : Green Card
Last updated: April 2010

WORK EXPERIENCE (latest projects first)

01 Nov 2006 – Present Machine Learning Scientist, Seattle, WA
Developing and deploying scalable implementations of machine-learning algorithms, and using them to predict various aspects of merchant and customer behavior in order to enhance quality of service and to reduce financial and other types of risk for Amazon. Providing expertise and guidance to multiple groups and departments at Amazon on strategic machine-learning technologies and issues. Domains have included many areas in fraud, product demand forecasting, item and website recommendation, matching and categorization.
In particular, dealing with learning problems involving highly skewed priors, sparse features, adversarial learning settings, and incomplete and noisy labels. Supervised and unsupervised learning on text data, learning on massive amounts of real-time streaming data, and learning on temporal and historical profiles.
Platform : Linux, Windows XP, Perl, Java, Hibernate, Spring, C++, Matlab, Shell-scripting, SAS, SQL, Oracle, MySQL, Weka, Amazon SDE platforms

02 Jan 2004 – Oct 2006 Research Assistant for Professor Joydeep Ghosh, UT-Austin, Austin, TX
An NSF RA grant directly supported my Ph.D. dissertation work, in which I developed algorithms that can discover high-purity clusters, without supervision, in large, very noisy, high-dimensional datasets where most of the data points do not cluster well. Application domains include bioinformatics, market-basket data, web data, biometrics and anomaly detection (e.g. Gene DIVER).
Platform : Linux, Windows 2000, Perl, Java, SWING, Matlab, Weka

Gunjan Kumar Gupta, 4839 130th Ave SE, Bellevue, WA 98006, Ph(r): (425) 818-1551 Email:, Online Resume:

03 May 2005 – Aug 2005, Risk Management Group, Seattle, WA
I was involved in developing a fraud detection model using historical data. I designed and developed, from the ground up, a MySQL database and a modeling server that aggregates and summarizes modeling data from massive amounts of continuous raw data drawn from multiple live databases, together with an online probabilistic model that adapts automatically to new fraud patterns as they emerge and predicts new fraud activity. My trip to Bonn, Germany, as a first author at ICML-2005 was also sponsored, to help recruit researchers for Amazon.
Platform : MySQL (DBA + programming), J2SE 5.0, Red Hat, Shell scripting, reg-ex

04 May 2002 – Jan 2004 Standard & Poor’s, Risk Solutions Group, Santa Fe, NM / New York, NY
Involved in research and development of financial models and algorithms for the Risk Solutions Group at Standard & Poor’s, using historical data on companies, countries and individuals and applying logistic regression, neural networks, and other probabilistic methods.
Platform : Windows, Matlab, Linux, JBuilder, Java, C++, SAS, SQL, CVS, Perl.

05 June 2000 – May 2002 i2 Technologies, Austin, TX
Involved in development of CRM and Data Mining applications in Oracle SQL and Java, mainly:
Trending: Architect, designer and developer of a communication API to enable collection of information from e-commerce customer web sites for a CRM product, involving extensive OOAD and integration with existing i2 products and Rightworks.
Marketing Analytics and Campaign Manager: I was involved in the design, research and development of this product. It used data collected from Trending and other sources to provide real-time automated decision support, targeted advertisement and campaign management. Algorithms included advanced cross-sell recommendations. I also advised the PM on Data Mining.
Platform : NT 4.0 Server, Borland JBuilder, Java 2 SDK, C++, InstallAnywhere, Oracle 8i, ClearCase, JRun 3.0, JSP

06 May 1999 - May 2000 KD1/Net Perceptions, Austin, TX
Involved in developing algorithms for market-basket analysis and clustering, an extension of which became my Masters thesis and also resulted in two publications. Challenges included the size and dimensionality (~100,000 products) of the data.
Platform : NT 4.0, AIX, Sun, Matlab, Orchestrate, Korn-Shell, C++

07 Sept 1997 - June 1998 MCI, Colorado Springs
Extensive OO design and development of class libraries for Call-Processing software used in conjunction with DAP for regulatory routing of telephone calls on the MCI network.
Platform : VMS, DEC-Unix and Windows NT 4.0, C++, Object-Broker, Object-Store, CORBA, X-Motif, Rogue Wave & UIMX, Visual C++ 4.0, IDL

08 March 1997 - July 1997 Samsung Electronics, Seoul, South Korea
Design and development of Network Management Systems on the Windows 95 platform for the SR4024, a multi-protocol router, and the SH2024 hub, using the winSNMP library and Visual C++. Also ported the NMS onto Sun and HP platforms using the WindU tool.
Platform : SR4024/Multi router, SH2024 hub, Pentium m/c, Sun Sparc and HP m/c, Windows 95, HPUX, SunOS, WindU, NetXRay, SNMPc, agent software and MIB compiler.

09 July 1995 - March 1997 Infosys Technologies Ltd., Bangalore, India
Involved in many projects for Infosys clients including:
Sept 1996 - March 1997: OOAD & dev. of new class libraries and modules for Datavision, an analysis/visualization product for Nortel.
May 1996 - Sept. 1996: Developed an Extended MAPI interface for email support on Inconcert, a Workflow Automation product from Xsoft, a part of Xerox Inc.
Feb 1996 – May 1996: Designed and developed IMAP, a multimedia prototype client and part of the server for Nynex S&T Lab, Bangkok. Involved in on-site demo with Nynex’s customers.
Sept 1995 – Feb 1996: Designed and developed a back-end parser and X-Motif user interface called SLLBFM for the SLL language for Nynex S&T.
Platform : HP, Sun, Openwin, Fore-ATM, VAT and VIC, NVATM, Windows, XRT, Motif, BSD and Windows sockets, Extended MAPI, IPC, Exchange, GNU & Msft C++, Rogue Wave Tools.h++ & Views.h++, UIMX, X-Windows Motif 1.2/X11R5

EDUCATION & RESEARCH BACKGROUND

August 1998 – 2000, January 2003 – October 2006 University of Texas at Austin
Continued collaboration with UT IDEAL group since 2006. Served as a Reader for a UT Austin graduate student’s 2010 machine-learning-focused Masters thesis.
January 2004 – October 2006: PhD in Data Mining (Computer Engineering).
August 1998 - June 2000: MS in Data Mining (Computer Engineering).
Coursework: Advanced Topics in Data Mining, Bioinformatics, Arch. & App. of Biological Databases, Data Mining, Machine Learning, Digital Image Processing, Artificial Neural Networks, Optimization of Engineering Systems, Knowledge Representation, Practicum in Data-Mining (involving a project for Dell), Software Engineering Metrics, CPU Optimization for DSS Systems, Natural Language Processing.
Masters Thesis: Gupta, G. “Modeling Customer Dynamics using Motion Estimation in a Value Based Cluster Space for Large Retail Data-sets.” MS Thesis, Department of Electrical and Computer Engineering, University of Texas

Refereed publications

Journals:
1. G. Gupta, J. Ghosh, Bregman Bubble Clustering: A Robust Framework for mining Dense Clusterings, ACM Transactions on Knowledge Discovery from Data, 2(8), July 2008
2. G. Gupta, A. Liu, J. Ghosh, Automated Hierarchical Density Shaving: A robust, automated clustering and visualization framework for large biological datasets, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 17 March 2008

Conferences:
3. M. Deodhar, H. Cho, G. Gupta, J. Ghosh, I. Dhillon, A Scalable Framework for Discovering Coherent Co-clusters in Noisy Data, (Best Paper Award Honorable Mention), ICML 2009.
4. M. Deodhar, H. Cho, G. Gupta, J. Ghosh, I. Dhillon, Hunting Coherent Clusters in High Dimensional Noisy Datasets, In Workshop on Foundations of Data Mining, ICDM 2008.
5. G. Gupta, J. Ghosh, Bregman Bubble Clustering: A Robust, Scalable Framework for Locating Multiple, Dense Regions in Data, (Runner-up, Best Research Paper Award), ICDM 2006, December 2006, 12 pages.
6. G. Gupta, A. Liu, J. Ghosh, Hierarchical Density Shaving: A clustering and visualization framework for large biological datasets, ICDM 2006 Workshop on Data Mining in Bioinformatics (DMB 2006).
7. G. Gupta, A. Liu and J. Ghosh, Clustering and Visualization of High-Dimensional Biological Datasets using a fast HMA Approximation, In Proc. ANNIE 2006, ASME, November 2006, 6 pages
8. G. Gupta and J. Ghosh, Robust One-Class Clustering Using Hybrid Global and Local Search, In Proc. ICML 2005, August 7-11, 2005, Bonn, Germany, pp. 273-280
9. G. Gupta and J. Ghosh, Detecting Seasonal Trends and Cluster Motion Visualization for very High Dimensional Transactional Data, First SIAM Conf. on Data Mining (SDM2001), Chicago, April 2001.
10. G. Gupta and J. Ghosh, Value Balanced Agglomerative Connectivity Clustering, Proc. SPIE Conf. on Data Mining and Knowledge Discovery, SPIE Proc., Orlando, April 2001.
11. G. Gupta, A. Strehl and J. Ghosh, Distance Based Clustering of Association Rules, in Intelligent Engineering Systems Through Artificial Neural Networks, Vol. 9, ASME Press, Proc. ANNIE '99, Nov 1999, pp. 759-764.

Other Papers:
Platform : Linux, Windows 2000, Perl, Matlab, C++, Java, MySQL

August 1991 - May 1995 B-Tech, Institute of Technology, Banaras Hindu University
My undergraduate thesis was on recognition of hand-written numerals and alphabets using a meta-learner to combine the outputs from multiple classifiers. Other projects included Plot4, a chess-like game with AI and learning that won first prize in an IEEE contest, a Pascal to C translator, and a 2.5-month internship at Tata Iron & Steel Corp., Jamshedpur, India, involving stove operation simulations.
Platform : BGI, Pascal, Borland C/C++, cc, MASM, Unix, Dos, Assemb. 8086