SCOPE OF THE PROJECT:
The project focuses on building a Data Warehouse application for analyzing property sales in Brooklyn, one of the five boroughs of New York City. The project is split into 5 main phases:
Phase 1: Finding the dataset, understanding its structure, and identifying the meaningful business questions it could answer.
Phase 2: Extract-Transform-Load (ETL) processes for the data warehouse, using RStudio.
Phase 3: Building the Data Warehouse using Microsoft SQL Server.
Phase 4: Building the Multidimensional Cube using Microsoft Analysis Services and Visual Studio.
Phase 5: OLAP Report and Data Visualization (using Tableau).
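To make the ETL and warehouse phases more concrete, here is a minimal sketch of how a flat sales table might be split into dimension tables and a fact table, and then rolled up by year. It is written in Python purely for illustration (the project itself uses RStudio, SQL Server and Analysis Services), and the column names, sample rows and helper functions are all hypothetical, not taken from the actual dataset or the authors' design.

```python
from collections import defaultdict

# Hypothetical sample rows; the column names are illustrative assumptions,
# not the real schema of the Brooklyn sales dataset.
sales = [
    {"neighborhood": "BAY RIDGE", "building_class": "A1",
     "sale_date": "2016-03-14", "sale_price": 850000},
    {"neighborhood": "BAY RIDGE", "building_class": "B2",
     "sale_date": "2016-07-02", "sale_price": 1200000},
    {"neighborhood": "PARK SLOPE", "building_class": "A1",
     "sale_date": "2017-01-20", "sale_price": 2100000},
]


def build_dimension(rows, column):
    """Map each distinct value of `column` to a surrogate key (1, 2, 3, ...)."""
    keys = {}
    for row in rows:
        keys.setdefault(row[column], len(keys) + 1)
    return keys


def build_fact_table(rows, dim_neighborhood, dim_class):
    """Replace descriptive attributes with surrogate keys and keep the measure."""
    facts = []
    for row in rows:
        facts.append({
            "neighborhood_key": dim_neighborhood[row["neighborhood"]],
            "building_class_key": dim_class[row["building_class"]],
            "sale_year": int(row["sale_date"][:4]),
            "sale_price": row["sale_price"],
        })
    return facts


def rollup(facts, key):
    """Sum the sale_price measure grouped by one column of the fact table."""
    totals = defaultdict(int)
    for fact in facts:
        totals[fact[key]] += fact["sale_price"]
    return dict(totals)


dim_neighborhood = build_dimension(sales, "neighborhood")
dim_class = build_dimension(sales, "building_class")
facts = build_fact_table(sales, dim_neighborhood, dim_class)
totals_by_year = rollup(facts, "sale_year")
print(totals_by_year)
```

The roll-up at the end is the kind of aggregation an OLAP cube would precompute across dimensions; a real warehouse would likely also carry a proper date dimension rather than a bare year column.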
1. Front page
DATA WAREHOUSE for Brooklyn Property Sales Data
SOTIRIS BARATSAS
NIKOS SPANOS
ATHENS UNIVERSITY OF ECONOMICS AND BUSINESS
MSc in Business Analytics
PROJECT
DATA MANAGEMENT & BUSINESS INTELLIGENCE
2. Who we are
Sotiris Baratsas
sotbaratsas@gmail.com
Nikos Spanos
nickosspan@gmail.com
5. The Business Challenge
Make better decisions
Which areas should we focus our marketing efforts and manpower on?
Which kinds of properties would be the best use of our time & budget?
How can we take advantage of market changes quickly?
Data-driven pricing for increased commissions and shorter order fulfillment time
7. Dataset & Challenges
SOURCE
NEW YORK CITY DEPARTMENT OF FINANCE
ANNUAL ROLLING SALES DATA, AGGREGATED FOR 2003-2017
*Dataset Link: https://www.kaggle.com/tianhwu/brooklynhomes2003to2017
VERY LARGE DATASET
390,883 observations
TAX-FOCUSED DATASET
111 columns, mostly tax-related, bureaucratic or duplicate variables
COMPLICATED TAX SYSTEM
We had to study the legal terminology of the NYC tax system and extract data from additional sources
11. Cleaning the data
We removed irrelevant and duplicate columns, and ended up with 20 highly valuable columns: 11 dimensions and 9 measures.
13. Let’s take a look
14. Cleaning the data
Replaced missing measure values with NULL
Replaced missing dimension values with NA
Removed false values (years, ZIP Codes, districts)
Extracted data from additional data sources (bldg classes, lot type, land use)
Extracted “Month Sold” and “Year Sold” columns
Identified the correct data type for each column
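The cleaning steps above can be sketched in a few lines. This is only an illustration in plain Python, not the R code used in the project: the column names (sale_price, gross_sqft, neighborhood, building_class, zip_code, sale_date), the five-digit ZIP rule and the ISO date format are all assumptions made for the example.

```python
from datetime import date

# Hypothetical column names; the real dataset's names may differ.
MEASURES = ("sale_price", "gross_sqft")
DIMENSIONS = ("neighborhood", "building_class")


def clean_record(raw):
    """Return a cleaned copy of one raw sales record (a dict of strings)."""
    row = dict(raw)

    # Missing measures become None (NULL once loaded into the warehouse).
    for col in MEASURES:
        value = (row.get(col) or "").strip()
        row[col] = float(value) if value else None

    # Missing dimensions get an explicit "NA" label.
    for col in DIMENSIONS:
        value = (row.get(col) or "").strip()
        row[col] = value if value else "NA"

    # False values: a ZIP code must be exactly five digits (assumed rule);
    # anything else is treated as missing.
    zip_code = (row.get("zip_code") or "").strip()
    row["zip_code"] = zip_code if len(zip_code) == 5 and zip_code.isdigit() else "NA"

    # Derive "Month Sold" and "Year Sold" from an ISO-formatted sale date.
    sold = date.fromisoformat(row["sale_date"])
    row["month_sold"] = sold.month
    row["year_sold"] = sold.year
    return row


# Made-up record, used only to exercise the function.
example = clean_record({
    "sale_price": "",
    "gross_sqft": "1800",
    "neighborhood": " ",
    "building_class": "A1",
    "zip_code": "0",
    "sale_date": "2015-03-27",
})
print(example)
```

Keeping measures as None and dimensions as "NA" mirrors the NULL / NA split on the slide: measures stay out of aggregates, while dimensions keep a visible "unknown" member.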
40. Client Examples
A client wants to sell her property. Aware of the Assessed Value that the city’s finance department has placed on her lot, she wants to set the starting price equal to the Assessed Value.
42. Client Examples
A client comes to us to sell his property. He tells us he has a 2-floor property on an inside lot in East New York. To get the job, our agent has to show a good understanding of the prices for this type of property.
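The kind of lookup the agent needs here can be sketched as a filter plus an aggregate. This is a minimal stand-in for slicing the cube, written in plain Python rather than the project's own tools; the field names (neighborhood, floors, lot_type, sale_price) and every row below are made up for illustration and are not taken from the Brooklyn dataset.

```python
from statistics import median

# Made-up illustrative rows, NOT taken from the Brooklyn dataset.
sales = [
    {"neighborhood": "EAST NEW YORK", "floors": 2, "lot_type": "INSIDE", "sale_price": 300_000},
    {"neighborhood": "EAST NEW YORK", "floors": 2, "lot_type": "INSIDE", "sale_price": 340_000},
    {"neighborhood": "EAST NEW YORK", "floors": 2, "lot_type": "CORNER", "sale_price": 420_000},
    {"neighborhood": "PARK SLOPE", "floors": 2, "lot_type": "INSIDE", "sale_price": 1_200_000},
    {"neighborhood": "EAST NEW YORK", "floors": 2, "lot_type": "INSIDE", "sale_price": 380_000},
]


def median_price(rows, **filters):
    """Median sale price of the rows matching every filter, or None if none match."""
    prices = [
        r["sale_price"]
        for r in rows
        if all(r.get(key) == value for key, value in filters.items())
    ]
    return median(prices) if prices else None


# Slice by the three dimensions the client mentioned.
typical = median_price(sales, neighborhood="EAST NEW YORK", floors=2, lot_type="INSIDE")
print(typical)
```

The median is used instead of the mean so that a single unusually high or low sale does not distort the typical price quoted to the client.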
43. Client Examples
Our agent can access valuable information in seconds, appear as an expert and make smarter pricing decisions.
44. Front page
THANK YOU for your attention
SOTIRIS BARATSAS
NIKOS SPANOS