# Data Mining Techniques Using R and WEKA

This term paper explains two techniques:
1) Linear modelling using R
2) Clustering using WEKA

### Data Mining Techniques Using R and WEKA

1. Data Mining Techniques Using R and WEKA. IT for Business Intelligence term paper by Utsav Mone (10BM60094).
3. Data

   Dr Devlina Chatterjee of VGSoM has purchased a large amount of data from NSE for her research, and I have used a few files from it. There are three types of files:

   1) Snapshots
   2) Trade Data
   3) Price Volume Data

   Price Volume Data

   I have used February 2008 share data of Tata Motors. Except for the trade data, all of the data is available in the public domain. The file contains the following fields: i) Symbol, ii) Series, iii) Date, iv) Prev Close, v) Open Price, vi) High Price, vii) Low Price, viii) Last Price, ix) Close Price, x) Average Price, xi) Total Traded Quantity, xii) Turnover in Lacs.

   This text file is available at http://bit.ly/TM_PVD

   | Symbol | Series | Date | Prev Close | Open Price | High Price | Low Price | Last Price | Close Price | Average Price | Total Traded Quantity | Turnover in Lacs |
   |---|---|---|---|---|---|---|---|---|---|---|---|
   | TATAMOTORS | EQ | 03-Dec-2007 | 732.45 | 736 | 749 | 733.35 | 737 | 736.15 | 741 | 481721 | 3569.5399 |
   | TATAMOTORS | EQ | 04-Dec-2007 | 736.15 | 737 | 746 | 728.35 | 746 | 741.3 | 738.2 | 631272 | 4660.0808995 |
   | TATAMOTORS | EQ | 05-Dec-2007 | 741.3 | 744 | 783.9 | 744 | 773 | 772.4 | 769.92 | 1410714 | 10861.311993 |
   | TATAMOTORS | EQ | 06-Dec-2007 | 772.4 | 775.5 | 782 | 763.25 | 778 | 775.45 | 774.13 | 807793 | 6253.379844 |
   | TATAMOTORS | EQ | 10-Dec-2007 | 767.3 | 772 | 777.7 | 745.05 | 775 | 766.45 | 757.78 | 521361 | 3950.7440285 |
   | TATAMOTORS | EQ | 11-Dec-2007 | 766.45 | 770 | 777.3 | 761 | 777.3 | 775.2 | 770.04 | 676097 | 5206.1990345 |
   | TATAMOTORS | EQ | 12-Dec-2007 | 775.2 | 776.9 | 780 | 762 | 769 | 770.05 | 768.88 | 665743 | 5118.7625105 |
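As a rough illustration of how the Price Volume file could be loaded in R, the sketch below reads three of the sample rows with `read.csv`. The column names are illustrative labels for the twelve fields listed above, not names taken from the original file, and the trailing commas that appear on some rows are dropped for simplicity.

```r
# Illustrative column labels for the twelve fields; not from the original file.
pv_cols <- c("symbol", "series", "date", "prev_close", "open", "high",
             "low", "last", "close", "average", "quantity", "turnover_lacs")

# Three sample rows copied from the excerpt (trailing commas removed).
pv_text <- "TATAMOTORS,EQ,03-Dec-2007,732.45,736,749,733.35,737,736.15,741,481721,3569.5399
TATAMOTORS,EQ,04-Dec-2007,736.15,737,746,728.35,746,741.3,738.2,631272,4660.0808995
TATAMOTORS,EQ,05-Dec-2007,741.3,744,783.9,744,773,772.4,769.92,1410714,10861.311993"

pv <- read.csv(text = pv_text, header = FALSE, col.names = pv_cols,
               stringsAsFactors = FALSE)

# Parse dates such as "03-Dec-2007". %b relies on English month
# abbreviations, so this assumes an English or C locale.
pv$date <- as.Date(pv$date, format = "%d-%b-%Y")

# Daily trading range (high minus low) as a simple derived column.
pv$day_range <- pv$high - pv$low

print(pv[, c("date", "close", "quantity", "day_range")])
```

For the full file, the `text = pv_text` argument would be replaced by the path of the downloaded text file.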
4. Snapshots Data

   This type of data contains snapshots of the order book at four hours of the day: 11 Hr, 12 Hr, 13 Hr and 14 Hr. The snapshots cover Tata Motors for different months and hours of the day. Because there are too many files, it is difficult to upload them.

   Fields in the snapshot data: 1) Order Number 2) Company 3) Trade Type 4) No of shares in Order 5) Quote 6) Time Stamp 7) Buy Sell 8) Flags

   A sample of snapshot data for Tata Motors on 1 Feb, 11 Hr:

   | Order Number | Company | Trade Type | No of shares in Order | Quote | Time Stamp | Buy Sell | Flags |
   |---|---|---|---|---|---|---|---|
   | 2008020150046719 | TATAMOTORS | EQ | 500 | 559.60 | 09:55:48 | B | ynnn nnn nnn RL 0 |
   | 2008020150716321 | TATAMOTORS | EQ | 10 | 560.00 | 10:35:56 | B | ynnn nnn nnn RL 0 |
   | 2008020150034116 | TATAMOTORS | EQ | 100 | 575.00 | 09:55:22 | B | ynnn nnn nnn RL 0 |
   | 2008020150067971 | TATAMOTORS | EQ | 824 | 576.65 | 09:56:38 | B | ynnn nny nnn RL 0 |
   | 2008020100283272 | TATAMOTORS | EQ | 100 | 582.00 | 10:09:10 | B | ynnn nnn nnn RL 0 |
   | 2008020150233325 | TATAMOTORS | EQ | 25000 | 585.00 | 10:04:34 | B | ynnn nny nnn RL 0 |

   Details of the flags can be seen at https://docs.google.com/document/d/1pW0Fou2VzSiacEn0OKR5rKeKBRZTzw5USk7HOBVQRjs/edit
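The snapshot records are space separated, but the flags field itself contains spaces, so a plain `read.table` call would split it into several columns. A minimal sketch of one way to handle this in R is shown below, using three of the sample orders. The column names and the assumption that everything after the Buy Sell field belongs to the flags are illustrative, not taken from the original program.

```r
# Three sample snapshot records copied from the excerpt.
snap_lines <- c(
  "2008020150046719 TATAMOTORS EQ 500 559.60 09:55:48 B ynnn nnn nnn RL 0",
  "2008020150067971 TATAMOTORS EQ 824 576.65 09:56:38 B ynnn nny nnn RL 0",
  "2008020150233325 TATAMOTORS EQ 25000 585.00 10:04:34 B ynnn nny nnn RL 0"
)

# Split each record on runs of spaces.
parts <- strsplit(snap_lines, " +")

snap <- data.frame(
  # Kept as text: a 16-digit order number is too large for R's integer type.
  order_no   = sapply(parts, `[`, 1),
  company    = sapply(parts, `[`, 2),
  trade_type = sapply(parts, `[`, 3),
  shares     = as.integer(sapply(parts, `[`, 4)),
  quote      = as.numeric(sapply(parts, `[`, 5)),
  time       = sapply(parts, `[`, 6),
  side       = sapply(parts, `[`, 7),
  # Assumption: all remaining tokens form the flags field.
  flags      = sapply(parts, function(p) paste(p[8:length(p)], collapse = " ")),
  stringsAsFactors = FALSE
)

# Total quantity and highest quote among the buy orders in this sample.
buy_orders <- snap[snap$side == "B", ]
buy_qty  <- sum(buy_orders$shares)
best_bid <- max(buy_orders$quote)
```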
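5. Trade Data

   This is daily trade data, which lists all the trades that took place in a day.

   Fields in the trade data: 1) Trade Number 2) Name of Company 3) Type of Trade 4) Time of Trading 5) Price 6) Volume of shares traded

   Opening data, Price = 708

   | Trade Number | Name of Company | Type of Trade | Time of Trading | Price | Volume of shares traded |
   |---|---|---|---|---|---|
   | 2475593 | TATAMOTORS | EQ | 09:55:16 | 708 | 3713 |
   | 2475830 | TATAMOTORS | EQ | 09:55:20 | 708 | 800 |
   | 2475871 | TATAMOTORS | EQ | 09:55:21 | 708 | 200 |
   | 2475872 | TATAMOTORS | EQ | 09:55:21 | 708 | 1 |
   | 2475873 | TATAMOTORS | EQ | 09:55:21 | 708 | 1 |
   | 2475874 | TATAMOTORS | EQ | 09:55:21 | 708 | 210 |
   | 2475935 | TATAMOTORS | EQ | 09:55:22 | 708 | 800 |

   Note the price variation within 3 seconds, from 755 and back to 755:

   | Trade Number | Name of Company | Type of Trade | Time of Trading | Price | Volume of shares traded |
   |---|---|---|---|---|---|
   | 3843007 | TATAMOTORS | EQ | 13:33:37 | 755 | 5 |
   | 3843008 | TATAMOTORS | EQ | 13:33:37 | 755 | 453 |
   | 3843021 | TATAMOTORS | EQ | 13:33:38 | 754.9 | 1 |
   | 3843022 | TATAMOTORS | EQ | 13:33:38 | 754.55 | 9 |
   | 3843037 | TATAMOTORS | EQ | 13:33:38 | 755 | 1 |
   | 3843050 | TATAMOTORS | EQ | 13:33:38 | 754.9 | 1 |
   | 3843051 | TATAMOTORS | EQ | 13:33:38 | 754.9 | 9 |
   | 3843052 | TATAMOTORS | EQ | 13:33:38 | 754.9 | 1 |
   | 3843069 | TATAMOTORS | EQ | 13:33:39 | 755 | 1 |

   More detail of the data is available at https://docs.google.com/document/d/1pW0Fou2VzSiacEn0OKR5rKeKBRZTzw5USk7HOBVQRjs/edit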
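The three-second price movement in the second sample can be checked with a few lines of R. The sketch below is only an illustration. It assumes each record splits into a seven-digit trade number, company, trade type, time, price and volume, and the column names are illustrative labels rather than names from the original files.

```r
# The nine trades between 13:33:37 and 13:33:39, one record per line.
trade_text <- "3843007 TATAMOTORS EQ 13:33:37 755 5
3843008 TATAMOTORS EQ 13:33:37 755 453
3843021 TATAMOTORS EQ 13:33:38 754.9 1
3843022 TATAMOTORS EQ 13:33:38 754.55 9
3843037 TATAMOTORS EQ 13:33:38 755 1
3843050 TATAMOTORS EQ 13:33:38 754.9 1
3843051 TATAMOTORS EQ 13:33:38 754.9 9
3843052 TATAMOTORS EQ 13:33:38 754.9 1
3843069 TATAMOTORS EQ 13:33:39 755 1"

trades <- read.table(text = trade_text, header = FALSE,
                     col.names = c("trade_no", "company", "trade_type",
                                   "time", "price", "volume"),
                     stringsAsFactors = FALSE)

# Lowest and highest traded price in the window.
price_range <- range(trades$price)

# Shares traded in each second.
vol_by_sec <- tapply(trades$volume, trades$time, sum)

# Volume-weighted average price over the window.
vwap <- sum(trades$price * trades$volume) / sum(trades$volume)

print(price_range)
print(vol_by_sec)
print(vwap)
```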
6. R Program

   Data Location

   The working directory must be set in R, because R looks for all files in the assigned directory.

   Package Requirements

   chron, zoo, fda, MASS, proto, DBI, RSQLite, RSQLite.extfuncs, stats4, sde, tcltk, sqldf
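A minimal sketch of this setup step is given below. The folder name is hypothetical and must be replaced with the directory that actually holds the data files. The package names are written in their CRAN spelling, and some of them (for example RSQLite.extfuncs) may no longer be installable from the current CRAN repository, so the sketch only loads those that are already installed.

```r
# Hypothetical data folder; replace with the real location of the files.
data_dir <- "~/nse_data"
if (dir.exists(data_dir)) {
  setwd(data_dir)  # R resolves relative file names against this directory
}

# Packages listed in the paper.
pkgs <- c("chron", "zoo", "fda", "MASS", "proto", "DBI", "RSQLite",
          "RSQLite.extfuncs", "stats4", "sde", "tcltk", "sqldf")

# Check which packages are installed without stopping on the missing ones.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to try installing from CRAN
}

# Attach the packages that are available.
for (p in pkgs[installed]) {
  library(p, character.only = TRUE)
}
```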