The document describes The Asset Consultancy, a project that analyzes and predicts current and future prices of real-estate properties. It fetches data from various sources and uses Hadoop and its components, such as Hive and Sqoop, to analyze the data and produce price predictions, helping customers decide whether investing in a particular property is advantageous. The document outlines the technology used, system architecture, current scenario, proposed new flow, features, implementation of big data using Hadoop, and the project's future scope.
2. Project Details:
Project Title: The Asset Consultancy
Project ID: 46414
Group Size: 2
Name of Developers: Janki Kansara (120170107024), Rushin Naik (120170107046)
Project Type: IDP
Company Name: Sculptsoft
3. Index:
1) Abstract
2) Technology Used
3) System Architecture
4) Current Scenario
5) New Flow
6) Project Need
7) Core Features and Benefits
8) What's New
9) Diagrams
10) Implementation of Big Data
11) Future Scope
12) Conclusion
13) Bibliography/References
4. Abstract:
The Asset Consultancy analyzes and predicts the current and future prices of real-estate commercial properties and buildings. Based on the analysis and prediction of property prices, the customer can decide whether or not to invest money in a given property. Examining the results shows whether investing money in the estate is likely to produce a positive or a negative outcome, making the customer aware of the advantages and drawbacks of investing in a particular place. Data will be fetched from different sources and compiled in one place, where an analysis algorithm, implemented in the Java language, runs on the Hadoop platform and its internal components.
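The kind of trend analysis the abstract describes can be sketched as a least-squares line fit over historical yearly average prices. This is an illustrative sketch only: the slides do not specify the project's actual algorithm, the class and method names here are hypothetical, and in the real system such logic would run over data imported into Hadoop rather than over an in-memory array.

```java
/**
 * Illustrative sketch (not the project's actual algorithm): predict a
 * future property price by fitting a straight line, via ordinary least
 * squares, to past yearly average prices for one locality.
 */
public class PriceTrend {

    /** Returns {slope, intercept} of the least-squares line y = slope*x + intercept. */
    static double[] fitLine(double[] years, double[] prices) {
        int n = years.length;
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += years[i];
            sy += prices[i];
            sxx += years[i] * years[i];
            sxy += years[i] * prices[i];
        }
        double slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        double intercept = (sy - slope * sx) / n;
        return new double[] { slope, intercept };
    }

    /** Predicts the price for a given year from the fitted trend. */
    static double predict(double[] years, double[] prices, double year) {
        double[] line = fitLine(years, prices);
        return line[0] * year + line[1];
    }

    public static void main(String[] args) {
        // Hypothetical per-square-meter prices for one locality.
        double[] years  = { 2011, 2012, 2013, 2014, 2015 };
        double[] prices = { 30000, 32000, 35000, 37000, 40000 };
        System.out.printf("Predicted 2016 price: %.0f%n",
                predict(years, prices, 2016));
        // prints "Predicted 2016 price: 42300"
    }
}
```

A positive slope suggests the property value is trending upward, which is the signal the customer would use when deciding whether to invest.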
5. Technology Used:
Software Used
Operating System: Windows 7 32-bit Professional, Linux Ubuntu
Web Server: Apache Tomcat Server 8.0.15.0
Server-side Application Software: NetBeans IDE 7.0.1 or 8.0.2
Languages: JavaScript, HTML/CSS, Java, JSP
Database: SQL Server Management Studio
Client Browsers: Google Chrome or Firefox
Java Software: JDK 1.8.0_66
Hadoop 2.7.1
Hive 1.2.1
Sqoop 1.4.6
NetBeans IDE 8.0.2
MySQL Workbench 6.2
Linux OS / Linux OS environment on Windows
11. Project Need:
• To get authentic data about upcoming trends from past years' real-estate pricing.
• To improve agent-customer connectivity.
• To get an idea of the scenario at a personal level rather than just following what the agent says.
• To save money by making decisions yourself.
• To invest more precisely and smartly.
• To improve awareness of market trends.
• To get the approximate value of one's own property after a few years.
12. Core Features and Benefits:
• Provides the latest data on price fluctuations in real estate by referring to past years' data.
• Provides the facility to create groups for interacting with other customers and agents.
• Data for the next few years can be analyzed and summarized, giving the customer a better idea; the property can be understood more deeply, with better prospects.
• Authenticated data: all in one place.
13. What's new?
• Yearly prediction of real-estate pricing from past and current scenarios. This new application predicts asset prices and gives an idea of the trend in property values.
• Users can create groups and discuss different properties with agents. Through the group feature, an agent can send notifications to multiple customers as well as to other agents. This facility promotes customer-agent interaction.
• Other features worth mentioning: the user can get alerts about the latest updates on a particular property, easy and user-friendly search across multiple properties, and a simple interface.
• The system not only displays the trend in graphical form but also lets the user save the analysis and prediction report to disk.
21. State Transition Diagram:
[Diagram: user flow from registration and login through the system interface to property search, group creation and verification (invite members, read posts, discuss properties, wait for notifications), property selection with specification, viewing descriptions and feedback, stalking or comparing multiple properties, viewing analysis and prediction with a selected chart type, and logout. Invalid entries and "no such property found" return the user to the previous state.]
22. State Transition Diagram (Customer, Administrator, Broker):
(Diagram only. Customer lane: Registration, Login (Invalid Entry on failure),
View Property Details, Make Group, Group Created, Manage Group Activities,
Give/View Feedback, View Analysis/Prediction, Logout. Administrator lane:
Login, Broker Profile Verification, Broker Verified, Manage Group Request,
Group Verified, Logout. Broker lane: Registration, Login, Make Group, Group
Created, Manage Group Activities, Update Property Details, Verify Property
Update, View Feedback, View Property Details, Logout.)
Activity Diagram: (diagram only)
27. Data Dictionary:
Property:
Field Data Type Constraints Description
P_id Int Primary key Unique identity of property.
P_name Varchar(50) Not null Name of property.
P_location Varchar(200) Not null Address of the property.
P_size Int Not null Area of property in sq. meters.
P_type Varchar(50) Not null Features of property.
Associated_agent Varchar(50) Allow null Agent handling the property.
Description Varchar(500) Not null Summary of the property.
28. Customer:
Field Data Type Constraints Description
C_id Int Primary key Unique identity of customer.
Log_id Int Foreign key Login id.
C_name Varchar(50) Not null Name of customer.
C_gender Varchar(10) Not null Gender of customer.
C_contactno Varchar(50) Not null Contact number of the customer.
C_email Varchar(50) Not null Email id of the customer.
G_id Int Allow null Id of the group the customer has joined.

Agent:
Field Data Type Constraints Description
A_id Int Primary key Unique identity of agent.
Log_id Int Foreign key Login id.
A_name Varchar(50) Not null Name of agent.
A_contactno Varchar(50) Not null Contact number of agent.
A_address Varchar(50) Not null Address of agent.
A_email Varchar(50) Not null Email id of agent.
G_id Int Allow null Id of the group the agent has joined.
29. Group:
Field Data Type Constraints Description
G_id Int Primary key Unique identity of the group.
G_name Varchar(50) Not null Name of group.
G_admin Varchar(50) Not null Administrator of the group.
No_of_members Int Not null Number of members in the group.
Log_id Int Not null Login ids of the members in the group.
G_date Date Not null Date on which the group was created.

Feedback:
Field Data Type Constraints Description
F_id Int Primary key Unique identity of feedback.
C_id Int Foreign key Customer who gave the feedback.
F_type Varchar(50) Not null Property feedback or agent feedback.
Subject_id Int Not null Agent id or property id about which the feedback is posted.
30. Report:
Field Data Type Constraints Description
R_id Int Primary key Unique identity of the analysis and prediction report.
P_id Int Foreign key Property about which the report is generated.
User_id Int Foreign key The user who issued the report.
R_date Date Not null Date on which the report was generated.
R_path Varchar(100) Not null Path to the report file.

Login:
Field Data Type Constraints Description
Log_id Int Primary key Unique identity used for login.
Username Varchar(50) Not null Username of the user.
Password Varchar(20) Not null Password for login.
User_type Varchar(10) Not null Type of user: Customer, Agent, or Admin.
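Assuming MySQL as the backing store, the Login and Property entries of the data dictionary would translate into DDL along these lines. This is an illustrative sketch, not the project's actual schema, and the database name is an assumption:

```shell
# Sketch of MySQL DDL implied by the data dictionary; the database name
# "assetdb" is an assumption, and a running MySQL server is required.
mysql -u root -p assetdb <<'SQL'
CREATE TABLE Login (
  Log_id    INT PRIMARY KEY,          -- unique identity used for login
  Username  VARCHAR(50) NOT NULL,
  Password  VARCHAR(20) NOT NULL,
  User_type VARCHAR(10) NOT NULL      -- Customer, Agent, or Admin
);

CREATE TABLE Property (
  P_id             INT PRIMARY KEY,   -- unique identity of property
  P_name           VARCHAR(50)  NOT NULL,
  P_location       VARCHAR(200) NOT NULL,
  P_size           INT          NOT NULL,  -- area in sq. meters
  P_type           VARCHAR(50)  NOT NULL,
  Associated_agent VARCHAR(50),            -- allows null
  Description      VARCHAR(500) NOT NULL
);
SQL
```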
31. Implementation of Big Data
Hadoop is used to process and condense the large
amount of data existing in the database.
This data (mostly stored in row format) is synchronized
and processed using Hive and Sqoop.
The data fetched from the database is then reduced using
Hadoop's MapReduce technique.
Thousands of rows are condensed down to 50-100 summary
rows, giving a compact representation of an otherwise
unmanageable number of records.
These summary rows supply the data required by the
program, which presents them in chart form.
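As a toy illustration of this reduction (not the project's actual MapReduce job), the map-shuffle-reduce pattern can be mimicked with plain shell tools. Inline year,price records stand in for the database rows; sort plays the shuffle step and awk acts as the reducer, averaging prices per year:

```shell
# Map: emit key,value records (year,price). Sample values are made up.
printf '%s\n' \
  '2013,4500' '2013,4700' \
  '2014,5100' '2014,4900' '2014,5000' \
  '2015,5600' |
  sort |
  awk -F, '{ sum[$1] += $2; n[$1]++ }
       END { for (y in sum) printf "%s %d\n", y, sum[y] / n[y] }' |
  sort
# Six input rows are reduced to one summary row per year:
#   2013 4600
#   2014 5000
#   2015 5600
```

The same grouping-and-aggregation idea, scaled out over HDFS blocks, is what shrinks thousands of property rows down to the 50-100 chartable rows described above.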
32.-33. (Charts only: the reduced data is presented in chart form, as shown
on these slides.)
34. Sqoop Functionality
Sqoop is used in our project to transfer data between
Hadoop and relational databases. The data is imported
from MySQL into the Hadoop Distributed File System
(HDFS), transformed in Hadoop MapReduce, and
then exported back into an RDBMS.
Sqoop automates most of this process, relying on the
database to describe the schema for the data to be
imported. Sqoop uses MapReduce to import and
export the data, which provides parallel operation as
well as fault tolerance.
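The import/export flow described above can be sketched as Sqoop invocations of the following shape. The connection URL, credentials, table names, and HDFS paths are all assumptions, and a running Hadoop cluster plus MySQL instance are required, so these commands are illustrative only:

```shell
# Import the Property table from MySQL into HDFS, using 4 parallel map tasks.
sqoop import \
  --connect jdbc:mysql://localhost/assetdb \
  --username hadoop -P \
  --table Property \
  --target-dir /user/hadoop/property \
  -m 4

# Export the reduced results produced by MapReduce back into MySQL.
sqoop export \
  --connect jdbc:mysql://localhost/assetdb \
  --username hadoop -P \
  --table Property_summary \
  --export-dir /user/hadoop/property_summary
```

Sqoop infers the column layout from the database schema, which is the "relying on the database to describe the schema" behavior mentioned above.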
36. Hive Functionality
Hive defines a simple SQL-like query language, called QL,
that enables users familiar with SQL to query the data. At
the same time, this language also allows programmers who
are familiar with the MapReduce framework to be able to
plug in their custom mappers and reducers to perform
more sophisticated analysis that may not be supported by
the built-in capabilities of the language. QL can also be
extended with custom scalar functions (UDFs),
aggregations (UDAFs), and table functions (UDTFs).
In our project, Hive is used to process the tables
fetched from the database and to create new tables in
the Hadoop database.
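As a sketch of this step, a HiveQL script of the following shape could build the new tables; the table and column names are assumptions loosely based on the data dictionary, and a configured Hive installation is required:

```shell
# Illustrative HiveQL run through the hive CLI; names are assumptions.
hive -e "
  CREATE TABLE IF NOT EXISTS property_prices
    (p_id INT, p_location STRING, price INT, yr INT)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

  -- Load the rows Sqoop imported into HDFS:
  LOAD DATA INPATH '/user/hadoop/property' INTO TABLE property_prices;

  -- Reduce the raw rows to one averaged row per location and year:
  CREATE TABLE price_trend AS
    SELECT p_location, yr, AVG(price) AS avg_price
    FROM property_prices
    GROUP BY p_location, yr;
"
```

The GROUP BY runs as a MapReduce job under the hood, so this single statement performs the row-reduction step without hand-written mappers and reducers.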
38. Criteria Fetch
We have set particular criteria for the function to fetch
the data from the database table, as shown below.
The query fetches the relevant data from the MySQL
database, and the function then processes it according
to the criteria that have been set.
The query is written and run externally by Hadoop; after
it has been fully processed, the script is called against
the database tables.
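The actual criteria query is not reproduced in these notes. A representative fetch of this kind, with column names taken from the data dictionary and purely hypothetical filter values, might look like:

```shell
# Hypothetical criteria fetch against the MySQL Property table; the real
# filter conditions used by the project are not shown in the slides.
mysql -u root -p assetdb -e "
  SELECT P_id, P_name, P_location, P_size
  FROM   Property
  WHERE  P_location LIKE '%Mumbai%'
    AND  P_size BETWEEN 50 AND 200;
"
```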
40. Script Call
As soon as the criteria are fetched, the relevant script is
called from the database table and used in the servlet and
JavaScript pages.
The fetched script contains all the information the
database needs to run the Sqoop and Hive functions.
The script is essentially the query that is fetched and
executed in Hadoop.
43. Future Scope
New features for personal advice from advisers can be added to the system:
customers could ask advisers questions directly and get their doubts cleared.
More precise evaluation of parameters can be done to provide more accurate
analysis, and better algorithms can be designed to improve the price
predictions.
Audio search with voice recognition can also be implemented to make the
system better and easier for users.
Furthermore, a property-area ranking can be published twice a month. Such a
feature would give the customer a better idea of which area to select for
investment.
Features such as Stalk Property and comparison of two or more properties can
be added, so that customers know which of the compared properties is best and
can keep track of the properties they like.
44. Conclusion
In conclusion, if the project is implemented it will be of great help to
people who would like to analyze the real estate market for themselves. Its
features will enable users to understand the trend and thus reach a decision
based on the property statistics. Though many other systems will compete with
it, once the limitations are eliminated and the future enhancements are
implemented, it could be one of the best in the asset-market segment.
45. Bibliography/References
Books:
Object-Oriented Analysis and Design
using UML by Michael R. Blaha and
James R. Rumbaugh
Java 2: The Complete Reference
(Fourth Edition), Herbert Schildt,
TMH
Websites:
• www.magicbricks.com
• www.99acres.com
• www.propchill.com
• www.google.com
• www.tutorialspoint.com