Machine Learning Project by Group E
Disclaimer: The professor later told us that the regression analysis needed some improvements and was missing some details. This was our initial deliverable.
Using Regression for Identifying Opportunities in Real Estate
1. Presented by:
Bruno Cervantes Quino
Bruno Gobbet Gianini
Federico J. Garcia Lopez
George Kofi Akanza
Kamal Nandan
Melody Ucros
Peder Viland
Identifying New Opportunities
REGRESSION REPORT 2017
To: Sales Director
2. Table of Contents
Executive Summary
Recommendations
   Targeted Campaigns
   Internal Alert System
   Agent Compensation
Analyst Approach
   Data Preparation
   Variable Selection
   Model Creation
   Model Validation
Annex
Regression Report 2017 2
3. Executive Summary
As a real estate agency, we consider identifying good opportunities to help clients buy or sell a
home a key part of our value proposition. The team of analysts has created a model with which we
can estimate rental prices based on a number of factors and gain a competitive advantage with our
offers in the market.
To create this tool, we used a dataset from idealista.com. The dataset consisted of numerous
descriptive variables that could influence price but, based on their statistical significance, the
team has only kept the following variables and some of their interactions:
• Sq. Meters of the property
• Area where the property is located
• Bedrooms that the property has
• If it is an Outer Property
• If it is a Penthouse or a Duplex
Note that certain descriptors have more influence than others when estimating price. For
example, key observations regarding the model's predictive ability are that:
• When a property is a penthouse, or when we know the sq. mt. of a property in the area
of Retiro, our model's estimate is the most accurate.
• When we know the sq. mt. of properties in the areas of Salamanca, Tetuan,
Chamartín, Centro and Hortaleza, our estimates can also be highly trusted.
• Our model works best for properties with rental prices below 10,000 or when the
property is above 10,000 sq. mt.
Therefore, these are the three initiatives that could help the agency gain an advantage over
competitors:
• Create targeted marketing campaigns for these specific areas, informing tenants
about the potential of their property with regard to buying, selling or renting prices.
• Build an internal tool in which properties that meet a certain sq. mt. threshold and
are priced below area standards are automatically added to the pipeline of
opportunities.
• Establish quarterly goals for the agents to meet in these areas, since the model
significantly improves our offensive strategy for finding properties.
In conclusion, we can expect these initiatives to improve the ROI of the agents' time, the
proactivity with which we approach clients, and the offers that we can negotiate in the market.
4. Recommendations
Targeted Campaigns
Branding ourselves as helpful and knowledgeable about the market can help us attract new
clients and grow our business. The best way to do this is through targeted Facebook Campaigns,
with an informative video of different neighborhoods, a walking tour, the average prices, and how
to maximize the value of properties. Another way to do this is through an email or mailbox
campaign, but it would be a bit harder to measure our efforts and allocate our resources.
Therefore, if the second option is preferred, the best strategy would be to partner with specific
building owners or local businesses in these different areas. The information from the model will
be used in the video, and again once these new clients approach us, so that we can more accurately
assess the opportunity at hand.
Internal Alert System
Creating an internal tool to keep track of good opportunities is essential. This can be done by
using software that can be configured to automatically discover and crawl real estate websites.
The initial configuration usually allows you to set data parameters like city, state, zip code, selling
price, rent price, address, property size and characteristics. Based on these parameters, listings
can be saved to a database with normalized fields for easier access and search capabilities. The
information from the model will be used to create those specific parameters per area, in order to
automatically issue an alert once a good opportunity is identified.
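As a rough illustration of the alert rule described above, the following Python sketch flags listings that meet a size threshold and are priced below an area standard. Everything in it is an assumption made for illustration: the `Listing` fields, the `AREA_RULES` table, the 15% discount, and all numeric values are hypothetical placeholders, not figures from the report or the model.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    listing_id: str
    area: str
    sq_mt: float
    rent: float


# Hypothetical per-area parameters. In practice these would be derived from
# the regression model; the numbers here are made-up placeholders.
AREA_RULES = {
    "Retiro": {"min_sq_mt": 60.0, "typical_rent_per_sq_mt": 18.0},
    "Salamanca": {"min_sq_mt": 70.0, "typical_rent_per_sq_mt": 20.0},
}

# Assumed rule: alert when the asking rent is at least 15% below the area standard.
DISCOUNT = 0.15


def is_opportunity(listing, rules=AREA_RULES, discount=DISCOUNT):
    """Return True when a listing is large enough and priced below the area standard."""
    rule = rules.get(listing.area)
    if rule is None:
        return False  # no parameters configured for this area
    if listing.sq_mt < rule["min_sq_mt"]:
        return False  # below the size threshold
    expected_rent = rule["typical_rent_per_sq_mt"] * listing.sq_mt
    return listing.rent < expected_rent * (1.0 - discount)


def find_opportunities(listings, rules=AREA_RULES, discount=DISCOUNT):
    """Filter crawled listings down to the ones that should trigger an alert."""
    return [item for item in listings if is_opportunity(item, rules, discount)]
```

Keeping the per-area parameters in a separate table means the thresholds can be updated as the model is refitted, without touching the alert logic.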
Agent Compensation
Real Estate is a people-business, and that means treating agents and clients at a superb level.
The information on rental prices that the model provides can be used to create a more
personalized compensation structure for finding those “anomalies” in the market, and closing
them. We would already be providing them with the tool to identify some, but it is up to them to
bring that business to our agency. This compensation structure can be driven by quarterly goals
that agents should fulfill in different areas, and the recruitment of clients that might not be
listed online.
5. Analyst Approach
Data Preparation
The original dataset had the following information: ID, Area, Address, Number, Zone, Rent,
Bedrooms, Sq. Mt., Floor, Outer, Elevator, Penthouse, Cottage, Duplex, and Semi-Detached. We
removed the columns that could not be converted to factors, were too narrow for our analysis, or
had too many missing values. The dataset wasn't too big, so we used Excel to clean it manually.
We then imported the prepared dataset into RStudio and converted several fields to factors to be
able to work with them.
Variable Selection
In order to choose which variables to keep, we used the forward stepwise method in R, which simply
kept adding significant variables until the model could no longer be improved. This was the
final output: Rent ~ Sq..Mt. + Area + Outer + Bedrooms + Penthouse + Duplex. This gave
us an R² of 0.76, which meant that only 76% of the variance in rental prices could be explained
by our model. The stepwise method doesn't test for interactions though, so we manually tested
different interactions of the variables and kept the ones that improved the R².
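The greedy idea behind forward selection can be sketched in a few lines of Python. This is not the team's R code: it uses a plain R² gain threshold (`min_gain`, a hypothetical parameter) as a stand-in stopping rule, whereas R's built-in stepwise routine ranks models by AIC. The helper names are illustrative, and predictors are assumed to be already numeric (categorical fields such as Area would need dummy coding first).

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of an ordinary least squares fit of y on an intercept plus the given columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(rows[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    mean_y = sum(y) / n
    ss_res = sum((y[i] - sum(rows[i][a] * beta[a] for a in range(p))) ** 2 for i in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


def forward_select(candidates, y, min_gain=0.01):
    """Greedily add the predictor that raises R^2 most; stop when the gain is below min_gain."""
    selected, best = [], 0.0
    remaining = dict(candidates)
    while remaining:
        scores = {
            name: r_squared([candidates[s] for s in selected] + [col], y)
            for name, col in remaining.items()
        }
        name = max(scores, key=scores.get)
        if scores[name] - best < min_gain:
            break
        selected.append(name)
        best = scores[name]
        del remaining[name]
    return selected, best
```

Because every step only adds single variables, a procedure like this never proposes interaction terms on its own, which is why interactions have to be tested separately.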
Model Creation
To compare the original model with the manually modified one, we used an ANOVA test. Because
the resulting p-value was below 5%, the added interaction did matter and the new model was
significantly better. This was the final model used:
Rent ~ Sq. Mt. + Area + Outer + Bedrooms + Penthouse + Duplex + Area: Sq. Mt.
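Written out, the final formula and the nested-model comparison look roughly as follows. The notation is an assumption introduced here, not taken from the report: a(i) denotes the area of property i, Bedrooms is treated as numeric, RSS_0 and RSS_1 are the residual sums of squares of the original and extended models, p_0 and p_1 are their numbers of estimated coefficients, and n is the number of observations.

```latex
% Final model: main effects plus an area-specific slope on square meters
\[
\text{Rent}_i = \beta_0 + \beta_1\,\text{SqMt}_i + \alpha_{a(i)}
  + \beta_2\,\text{Outer}_i + \beta_3\,\text{Bedrooms}_i
  + \beta_4\,\text{Penthouse}_i + \beta_5\,\text{Duplex}_i
  + \gamma_{a(i)}\,\text{SqMt}_i + \varepsilon_i
\]

% F statistic comparing the original model (0) with the extended model (1)
\[
F = \frac{(\mathrm{RSS}_0 - \mathrm{RSS}_1)/(p_1 - p_0)}{\mathrm{RSS}_1/(n - p_1)}
\]
```

Under the null hypothesis that the extra coefficients are all zero, F follows an F distribution with (p_1 - p_0, n - p_1) degrees of freedom, and a p-value below 5% leads to preferring the extended model.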
Model Validation
To validate our model we used K-Fold Cross Validation, dividing our dataset 80/20 for training
and testing. When testing, our final model yielded an R² that ranged from 0.83 to 0.87. We also
created some graphs to visually assess the model's predictive ability. One helped us see the
average price per area, and another showed which values might be influencing the curve of
our model. Another important conclusion from one of the graphs was that our model works best
for properties with rental prices below 10,000 or when the property is above 10,000 sq. mt.
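A minimal Python sketch of the validation loop, assuming five folds so that each round trains on 80% of the data and tests on the remaining 20%. To stay short it fits a one-variable line (rent on square meters) as a stand-in for the full model; the function names, the seed and the fold count are illustrative assumptions, not the team's R code.

```python
import random


def fit_line(xs, ys):
    """Least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(ys, preds):
    """R^2 of predictions against observed values."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def k_fold_r2(xs, ys, k=5, seed=0):
    """Return the held-out R^2 for each of k folds (k=5 gives an 80/20 split per round)."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint test sets covering all rows
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        intercept, slope = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        preds = [intercept + slope * xs[i] for i in test_idx]
        scores.append(r_squared([ys[i] for i in test_idx], preds))
    return scores
```

Reporting the spread of the per-fold scores, rather than a single number, is what produces a range such as the one quoted above.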