This document contains checklists for an OBM presentation and paper. The presentation checklist has 11 required slides that must be included to earn full points, such as a title slide, baseline graph, and question slide. The paper checklist details the required sections for the first and final drafts, including an introduction, definition of key terms, diagrams of various models, and a graph of data. Both checklists must be submitted and signed off by the student and TA, with points awarded based on completion of items. Failure to submit required files on a disk at the final presentation will result in point deductions.
1. OBM Presentation Checklist
Instructions: You are required to turn this in with the final draft of this project and give it to your TA prior to starting your OBM PowerPoint presentation. DO NOT staple it to your project. Make sure the following items are placed in your paper. Check each item off. Then sign at the bottom of the page.
On the day of the Final Fiesta, you are required to turn in a disk. This disk must contain electronic copies of your:
• Final Fiesta paper
• Final Fiesta Presentation
• SM paper
• SM presentation
• Graph of data for the SM project
• SM performance contract.
Points will be deducted from your Final Fiesta Grade if you fail to turn in this disk.
Requirements | Student initials (verifying completion) | TA check-off | Points earned (50 pts possible)
Each of the requirements from the Title Slide through the Question Slide MUST be included.
Title Slide (2pt) | ____ | ____ | ____
Setting and Participant Description (real or hypothetical??) (4pt) | ____ | ____ | ____
Reason To Intervene (4pt) | ____ | ____ | ____
Analyze Natural Contingencies applied to your project & Ineffective & Natural Competing Contingencies (5pt) | ____ | ____ | ____
Baseline graph included (2pt) | ____ | ____ | ____
Specify Performance Objectives applied to your project & Input-Process-Output (5pt) | ____ | ____ | ____
Design Intervention applied to your project & contingency diagram (5pt) | ____ | ____ | ____
Implement Intervention applied to your project (5pt) | ____ | ____ | ____
Evaluate Intervention applied to your project & graphs of your data (5pt) | ____ | ____ | ____
Recycle applied to your project (either real or hypothetical). Recycle graph included (5pt) | ____ | ____ | ____
Graph with recycle phase provided (2pt) | ____ | ____ | ____
Question Slide (2pt) | ____ | ____ | ____
TOTAL: ____ /50 pts.
Bonus Points For the Following:
Choose & Insert some of the following:
• Clip Art
• Pictures of your site
• Charts, diagrams, job aids that were used during the intervention (1pt per)
• Music
2. OBM Paper Checklist
Instructions: You are required to turn this in with your first and final drafts of this paper. It should be placed after your cover page. Make sure the following items are placed in your paper. Check each item off. Then sign at the bottom of the page.
On the day of the Final Fiesta, you are required to turn in a disk. This disk must contain electronic copies of your:
• Final Fiesta paper
• Final Fiesta Presentation
• SM paper
• SM presentation
• Graph of data for the SM project
• SM performance contract
Points will be deducted from your Final Fiesta Grade if you fail to turn in this disk.
Required Items | Student initials (verifying completion) | TA check-off | Points earned: Diagrams Draft (20 points) | Points earned: Full 1st Draft (20 points) | Points earned: Final Draft (100 points)
The cover page through the Graph of data must be included to receive an “A.”
Cover Page | ____ | ____ | 1pt | 1pt | 2pt
Paper Checklist | ____ | ____ | 1pt | 1pt | 2pt
Project Introduction: Site/Problem/Intervention Overview - real or hypothetical??? | ____ | ____ | - | 1pt | 8pt
8 terms from Chapter 12 (check the chapter to see which terms may or may not be included) defined, explained and applied to the paper | ____ | ____ | - | 3pt | 12pt
Input-Process-Output Model Diagram | ____ | ____ | 2pt | 1pt | 3pt
Input-Process-Output Model Description | ____ | ____ | - | 1pt | 3pt
Goal Specification Form Diagram | ____ | ____ | 3pt | 1pt | 3pt
Goal Specification Form Description | ____ | ____ | - | 1pt | 3pt
Cultural Change Model Diagram | ____ | ____ | 3pt | 1pt | 3pt
Cultural Change Model Description | ____ | ____ | - | 1pt | 3pt
6 Steps of Behavior Systems Analysis | ____ | ____ | - | 1pt | 30pt
Ineffective Natural Contingency Diagram | ____ | ____ | 2pt | 1pt | 3pt
Ineffective Natural Contingency description (see section in this document) | ____ | ____ | - | 1pt | 3pt
Natural Competing Contingency Diagram | ____ | ____ | 2pt | 1pt | 3pt
Natural Competing Contingency description (see section in this document) | ____ | ____ | - | 1pt | 3pt
Three Contingency Model of Performance Management Diagram | ____ | ____ | 3pt | 1pt | 3pt
Three Contingency Model of Performance Management description | ____ | ____ | - | 1pt | 3pt
Graph of your data: include baseline, intervention(s), dotted phase change lines, dotted goal line | ____ | ____ | 3pt | 1pt | 10pt
Points Earned | | | ____ | ____ | ____
Points Possible | | | 20 points | 20 points | 100 points
Bonus Points For the Following:
• Clip Art
• Pictures of your site
• Charts, diagrams, job aids that were used during the intervention (1pt per)