This document is a resume for Yu Wang, who is pursuing an MS in Computer Science from UT Dallas with a 3.35 GPA. Wang has experience in web development, big data, databases, and programming languages like Java, C#, Python, R and SQL. He is looking for a summer/fall 2016 internship in computer science. Some of Wang's projects include developing predictive models using Spark and machine learning algorithms, building web applications using ASP.NET and AngularJS, and performing data analysis on large datasets with tools like Hadoop, Pig, and Hive.
Yu Wang (Max)
2600 Waterview Pkwy ■ Richardson, Texas 75080 ■ Visa Type: F1 (have SSN)
Mobile: (972)-961-6952 ■ LinkedIn: www.linkedin.com/in/0yuwang0
Email: yxw124430@utdallas.edu ■ GitHub: https://github.com/MaxWang0
OBJECTIVE To obtain a challenging co-op/internship position (Summer/Fall 2016) in the field of Computer Science.
SPECIALIZATION & INTERESTS
Web development with ASP.NET MVC and Node.js
Big data development in Spark and MongoDB
Software development using Java
Data manipulation and data analysis using R, Perl and Python
Database design using MySQL and MSSQL
EDUCATION The University of Texas at Dallas, Richardson, TX
M.S. Computer Science, Anticipated Graduation – December 2016 GPA: 3.35
COMPUTER SKILLS
Programming Languages: Java, JDBC, JSP, Servlet, C#, C++, Perl, R, Python, Scala
Big Data: Google Cloud, MongoDB, Spark, Hadoop, Pig, Hive, Cassandra, Mahout, Recommendation System
Web Design: JavaScript, ASP.NET MVC, PHP, JSON, AJAX, jQuery, SOA
Software/IDE/Frameworks: Eclipse, Android Studio, Vim, Visual Studio, MSSQL Server, MySQL Workbench, ASP.NET MVC, Azure, IIS, Apache, AngularJS, ReactJS, Node.js
Scripting: Perl, Bash, Python, JavaScript
Database Systems: MongoDB, MySQL, Oracle, MSSQL, Stored Procedure
Operating Systems: Apache Linux server, Ubuntu Linux Shell scripting, Windows, Git
WORK EXPERIENCE
Web Big Data Engineer Ecomotto (startup), Richardson, TX December 2015 – present
Built a search engine for web data retrieval with Java and shell scripts on a Google Cloud server
Assisted with web service connections by customizing SOAP requests and retrieving data from an FTP server
Performed real-time data manipulation and analysis with Spark and MongoDB in Scala and SparkR
Developed server-side programs for real-time data retrieval with Java, shell scripting and MongoDB
Implemented AngularJS, MeteorJS and ReactJS (D3.js) for UI design and data visualization
Designed a predictive model for real-time prediction with accumulative logistic regression in SparkR
Research Assistant on Data Analysis University of Texas at Dallas, Richardson, TX Fall 2012 – Summer 2015
Retrieved sample data from a 2 TB data collection with a Perl mapping function executed from a shell script on an Apache server.
Developed predictive models to improve the detection rate of statistical bias in data with logistic regression, Bayesian Scheme, HMM and the EM algorithm in R and C++.
Validated accuracy with ROC curves plotted in R to revise the parameters.
Research Assistant Beijing Genomics Institute (BGI), Shenzhen, China July 2011 – December 2011
Implemented an automatic detection program in shell script and manipulated data with Perl and R on an Apache server
Performed bench work on sample preparation for raw data production, including whole exome capture from patients with single gene
PROJECTS
Microsoft Malware Classification with Apache Spark (Microsoft Kaggle Challenge) Summer 2015
Developed a predictive model with Scala in Spark to classify malware data (45 GB, bytes and asm files) into 9 classes based on content patterns and characteristics, using machine learning algorithms such as Naïve Bayesian Scheme, Decision Tree and Random Forest.
Technologies(Skill Sets): Spark, Scala, Python, Mahout, YARN, Naïve Bayesian, Decision Tree, Random Forest, PCA
ChinaTea Retail Web Application Development Fall 2014
Developed a Chinese tea eCommerce website (http://teahome.azurewebsites.net/) with the ASP.NET MVC framework using Visual C# to display different products and provide shopping functions, including a shopping cart and checkout.
Deployed a database system in Windows Azure to store, retrieve and update information about different categories of tea and users.
Technologies(Skill Sets): ASP.NET MVC, C#, HTML5, CSS3, MSSQL, JavaScript, jQuery, Visual Studio, Windows Azure
Online Library Management System with GUI Fall 2013
Developed a library management system using Java Swing, MySQL Server and JDBC.
Users can search for available books, check books in and out, and pay late fees online. The system also emails each user about the upcoming deadline to return their books.
Technologies(Skill Sets): Java, MySQL, Eclipse, MySQL Workbench
Android Mobile Contact Manager UI development Fall 2016
Designed and developed an Android Contact Manager application with a common user interface and functions.
Used Intents to pass variables between activities and created tabs in a TabHost; the user can add, edit, delete and save items locally as needed. Applied Android design principles across the whole application.
Technologies(Skill Sets): Java, XML, Android Studio
Hadoop Relational Operations on IMDB Datasets Summer 2015
Implemented various complex Pig Latin scripts, UDFs, Hive queries and Cassandra queries to derive analytics from the IMDB movie database.
Used the Hive and Pig frameworks to perform relational operations, including joins and co-groups, to analyze properties of the IMDB data such as the movie preferences of male and female users.
Technologies(Skill Sets): Pig, Hive, Cassandra, Map-Reduce, Hortonworks
Big Data Analysis on Online Purchase Data Summer 2015
Designed Hadoop Map-Reduce applications that run chained Map-Reduce jobs to derive statistics from the Online Purchase Dataset, such as the top 10 most popular stores and the number of purchases for a particular product type.
Technologies(Skill Sets): Python, Hadoop Framework, Map-Reduce, Linux, HDFS, Cloudera
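The chained-job idea in this project can be sketched in plain Python. This is a minimal in-memory illustration, not the project's actual Hadoop code: the record layout `(store, product_type, amount)`, the `run_job` helper and all function names are assumptions made for the example. The first job counts purchases per store, and the second job consumes the first job's output to pick the top N.

```python
from collections import defaultdict
import heapq


def run_job(records, mapper, reducer):
    """In-memory stand-in for one Map-Reduce job: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


# Job 1: count purchases per store (hypothetical record layout).
def count_mapper(record):
    store, _product_type, _amount = record
    yield store, 1


def count_reducer(store, ones):
    yield store, sum(ones)


# Job 2: route every (store, count) pair to a single key so one reducer sees them all.
def top_mapper(pair):
    yield "top", pair


def make_top_reducer(n):
    def top_reducer(_key, pairs):
        # Highest count first; ties are broken by store name for a stable result.
        yield from heapq.nsmallest(n, pairs, key=lambda p: (-p[1], p[0]))
    return top_reducer


def top_stores(records, n=10):
    """Chain the two jobs: the output of job 1 is the input of job 2."""
    counts = run_job(records, count_mapper, count_reducer)
    return run_job(counts, top_mapper, make_top_reducer(n))
```

Calling `top_stores(records, n=10)` on a list of purchase tuples returns `(store, count)` pairs ordered from most to least popular.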
Derive statistics from Yelp Dataset Summer 2015
Designed a Hadoop Map-Reduce application in Java to retrieve the top 10 businesses by average rating in the Yelp Dataset.
Implemented chaining of Map-Reduce jobs with both in-memory and reduce-side joins. Achieved the desired output using secondary sorting and custom partitioning in the Map-Reduce job.
Technologies(Skill Sets): Java, Hadoop Framework, Map-Reduce, Linux, HDFS, Hortonworks
Netflix Recommendation System Summer 2015
Implemented item-based collaborative filtering with Mahout's spark-itemsimilarity to produce business recommendations for specific users.
Technologies(Skill Sets): Scala, Apache Spark, Mahout, YARN
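As a rough illustration of item-based collaborative filtering, the sketch below scores unseen items by their similarity to items a user has already rated. It does not reproduce the project's Mahout pipeline: cosine similarity is swapped in as the similarity measure, ratings are assumed to be positive numbers, and the data layout `{user: {item: rating}}` and function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(ratings):
    """ratings: {user: {item: rating}} -> {(item_a, item_b): cosine similarity}."""
    dot = defaultdict(float)
    norm = defaultdict(float)
    for user_ratings in ratings.values():
        for a, ra in user_ratings.items():
            norm[a] += ra * ra
            for b, rb in user_ratings.items():
                if a != b:
                    dot[(a, b)] += ra * rb
    return {(a, b): d / (sqrt(norm[a]) * sqrt(norm[b])) for (a, b), d in dot.items()}


def recommend(ratings, user, top_n=3):
    """Rank items the user has not rated by similarity-weighted scores."""
    sims = item_similarities(ratings)
    seen = ratings[user]
    scores = defaultdict(float)
    for (a, b), s in sims.items():
        if a in seen and b not in seen:
            scores[b] += s * seen[a]
    # Highest score first; ties are broken by item name.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]
```

The pairwise loop is quadratic in the number of items each user rated, which is fine for a toy example but is the part a distributed engine such as Spark would parallelize.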
Post Office simulation with multiple threads Spring 2015
Used Java threads and semaphores to model customer and employee behavior in a post office.
Created separate threads to simulate customers and postal workers, and used semaphores to coordinate each customer thread with a postal worker thread. Mutual exclusion was kept to a minimum to allow maximum concurrency.
Technologies(Skill Sets): Java, Linux, Vim, Eclipse
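The coordination pattern described for this project can be sketched as follows. The original was written in Java; this is a Python illustration of the same idea under stated assumptions, and the names (`simulate`, `customer_ready`, `done`) and the sentinel-based shutdown are choices made for the example. Semaphores signal between customer and worker threads, and a lock is held only while touching the shared line and log.

```python
import threading
from collections import deque


def simulate(num_customers=6, num_workers=2):
    """Return a dict mapping each customer id to the worker id that served it."""
    waiting = deque()                         # customers standing in line
    line_lock = threading.Lock()              # held only around the line and the log
    customer_ready = threading.Semaphore(0)   # released once per item placed in line
    done = [threading.Semaphore(0) for _ in range(num_customers)]
    served_by = {}

    def customer(cid):
        with line_lock:
            waiting.append(cid)
        customer_ready.release()   # signal that someone is waiting
        done[cid].acquire()        # block until a worker has served this customer

    def worker(wid):
        while True:
            customer_ready.acquire()       # sleep until there is something in line
            with line_lock:
                cid = waiting.popleft()
                if cid is not None:
                    served_by[cid] = wid
            if cid is None:                # sentinel: the office is closing
                return
            done[cid].release()            # let the customer leave

    workers = [threading.Thread(target=worker, args=(w,)) for w in range(num_workers)]
    customers = [threading.Thread(target=customer, args=(c,)) for c in range(num_customers)]
    for t in workers + customers:
        t.start()
    for t in customers:
        t.join()
    # All customers are served; hand each worker a sentinel so it can exit.
    for _ in workers:
        with line_lock:
            waiting.append(None)
        customer_ready.release()
    for t in workers:
        t.join()
    return served_by
```

Because every item is appended to the line before `customer_ready` is released, a worker that acquires the semaphore always finds something to pop.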
Computer System simulation Spring 2015
Simulated a computer system consisting of a CPU and memory as multiple processes that model the computer instruction cycle. The simulated computer can run programs written in a specific instruction set.
Created 2 processes, one for the CPU and one for memory. The CPU has 6 registers and 1 cache; the memory has 1000 addresses. The two processes communicate with each other: the CPU fetches instructions from memory and then performs calculations on the fetched data.
Technologies(Skill Sets): Java, Linux, Vim, Eclipse
RELEVANT COURSES
The University of Texas at Dallas - Erik Jonsson School of Engineering & Computer Science
Big Data Management and Analytics
Database Design
Web Programming Languages
UI Design and Mobile Application
Design and Analysis of Computer Algorithms
Algorithm Analysis and Data Structures
Machine Learning
Operating System Concepts
Cloud Computing
Statistical Methods in Data Science
ACTIVITY Activity Designer of Friendship Association of Chinese Students and Scholars at UT Dallas
AVAILABILITY Summer/Fall 2016