This project deals with the detection of fraud in online shopping.



1. INTRODUCTION

Since the emergence of the World Wide Web (WWW), electronic commerce, commonly known as e-commerce, has become more and more popular. Websites such as eBay and Amazon allow Internet users to buy and sell products and services online, which benefits everyone in terms of convenience and profitability. The traditional online shopping business model allows sellers to sell a product or service at a preset price, where buyers can choose to purchase if they find it to be a good deal. Online auction, however, is a different business model in which items are sold through price bidding. There is often a starting price and an expiration time specified by the seller. Once the auction starts, potential buyers bid against each other, and the winner gets the item with the highest winning bid.

Like any platform supporting financial transactions, online auctions attract criminals who commit fraud. The types of auction fraud vary. Products purchased by the buyer may not be delivered by the seller. The delivered products may not match the descriptions posted by the seller. Malicious sellers may even post non-existent items with false descriptions to deceive buyers, and request that payments be wired directly to them via bank-to-bank transfer. Furthermore, some criminals apply phishing techniques to steal the accounts of highly rated sellers, so that potential buyers are easily deceived by the good rating. Victims of fraudulent transactions usually lose their money, which in most cases is not recoverable. As a result, the reputation of online auction services is hurt significantly by fraud.
2. PROJECT SPECIFICATIONS

2.1 Software Requirement Specification
Operating System : Windows 95/98/2000/XP
Application Server : Tomcat 5.0/6.x
Front End : HTML, Java, JSP
Client-side Scripts : JavaScript
Server-side Scripts : Java Server Pages
Database : MySQL
Database Connectivity : JDBC

2.2 Hardware Requirement Specification
Processor Speed : 1.1 GHz
RAM : 256 MB (minimum)
Hard Disk : 20 GB
Floppy Drive : 1.44 MB
Keyboard : Standard Windows keyboard
Mouse : Two- or three-button mouse
Monitor : SVGA
3. LITERATURE SURVEY

A literature survey is an important step in the software development process. Before developing the tool it is necessary to determine the time factor, economy, and company strength. Once these requirements are satisfied, the next step is to determine which operating system and language should be used for developing the tool. Once the programmers start building the tool, they need a lot of external support. This support can be obtained from senior programmers, from books, or from websites. The above considerations were taken into account in developing the proposed system.

3.1 Java

The term Java actually refers to more than just a particular language like C or Pascal. Java encompasses several parts, including:

• A high-level language: the Java language is a high-level one that at a glance looks very similar to C and C++ but offers many unique features of its own.
• Java byte code: a compiler, such as Sun's javac, transforms the Java language source code to byte code that runs in the JVM.
• Java Virtual Machine (JVM): a program, such as Sun's java, that runs on a given platform, takes byte code programs as input, and interprets them just as if it were a physical processor executing machine code.

Sun provides a set of programming tools such as javac, java, and others in a bundle that it calls a Java Software Development Kit for each version of the language and for different platforms such as Windows, Linux, etc. Sun also provides a runtime bundle with just the JVM when the programming tools are not needed. Note that because of the open nature of Java, any or all of these parts can be replaced by non-Sun components. For example, just as many different languages can create machine code for a given processor, compilers of other languages have been created that output byte code to run in the JVM. Similarly, many JVMs have been written by groups outside of Sun.
Java is not quite an open language, but not quite a proprietary one either. All the core language products - compiler, virtual machines (VMs), class packages, and other components - are free. Detailed specifications and source code are made openly available. The Java Community Process (JCP) leads the development of new standards for the language. Other companies and organizations can legally create a clean-sheet compiler and/or virtual machine as long as it follows the publicly available specifications. Microsoft did this with the version 1.1 JVM that it used in its Internet Explorer browser. Sun, however, still asserts final say on the specifications and controls the copyrights to the logos and trademarks. For example, Microsoft's VM differed in some significant details from the specifications, and Sun accused Microsoft of attempting to weaken Java's "write once, run anywhere" capabilities. Sun sued Microsoft and the dispute was later settled out of court.

Comparison with C++

Java has eliminated several features of C++, as listed here:

• No Pointers
• No Implicit Type Casting
• No Structures or Unions
• No Operator Overloading
• No Templates
• No Header Files

No Pointers

Java does not support any pointer arithmetic. As the improper use of pointers may lead to a system crash, the elimination of pointers makes Java applications more robust.
No Implicit Type Casting

Java does not support implicit type casting except for promotions. Automatic promotions are permitted; any demotion must be explicitly typecast. Thus, you may assign an int value to a float variable without an explicit typecast; however, to convert a float to an int, an explicit typecast is required.

No Structures and Unions

Unlike C++, Java does not support structures and unions. Thus, everything must be defined in terms of classes.

No Operator Overloading

Operator overloading, though a useful feature, is rarely used in practice due to the complexity involved in the coding. Java does not support user-defined operator overloading. The plus (+) operator is overloaded internally for string concatenation.

No Templates

Templates are generally used for defining mathematical libraries, etc. Java originally did not support the concept of templates. Note that later versions of Java support this style of programming through a newly added feature called generics.

No Header Files

Header files are required for declaring global variables and function prototypes. Java does not support the declaration of global variables, and method signatures can be generated during the first pass of the compiler. Thus, Java does not use the concept of header files.

No Multiple Inheritance

Multiple inheritance can lead to the diamond-shaped inheritance problem. C++ addresses this problem using the virtual keyword. Java does not support multiple inheritance of classes.
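The promotion/demotion rule above can be seen in a short sketch (the class and variable names here are ours, purely for illustration):

```java
public class CastingDemo {
    public static void main(String[] args) {
        int i = 7;
        float f = i;       // widening (promotion): implicit cast is allowed
        // int bad = f;    // would not compile: narrowing needs an explicit cast
        int j = (int) f;   // explicit demotion back to int
        double d = 9.99;
        int k = (int) d;   // explicit cast truncates the fraction: 9
        System.out.println(f + " " + j + " " + k);
    }
}
```

Running it prints `7.0 7 9`: the widening happened silently, while each narrowing had to be spelled out.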
Advantages of Java

Here we list some of the major benefits that Java can provide for general science and engineering applications:

Platform Independence
Scientists use more types of computers and OSs than most other groups. Code that can be exchanged without requiring rewrites and recompilation saves time and effort.

Object-Oriented
Besides the usual benefits from OOP, many scientific programs can benefit from thinking in terms of objects. For example, particles in a scattering simulation are naturally self-contained objects.

Threading
Multi-processing is very useful for many scientific tasks, such as simulations of phenomena where many processes occur simultaneously.

Networking
Java comes with many networking capabilities that allow one to build distributed systems. Such capabilities can be applied, for example, to remote data taking from sensors.

Embedded Applications
The original Oak language from which Java derived was intended for embedded applications. Platform independence, the other items mentioned above, and the adaptability of Java that allows it to work on micro-sized platforms by shedding nonessential code have made Java very popular for use in embedded devices such as smart cards and cell phones. It can thus also be embedded into sensors, controllers, and other types of engineering and scientific devices.

The Java programming language is a high-level language that can be characterized by all of the following buzzwords:

• Simple
• Architecture neutral
• Object oriented
• Portable
• Distributed
• High performance
• Interpreted
• Multithreaded
• Robust
• Dynamic
• Secure

Java Runtime Environment

The Java Runtime Environment (JRE), also known as the Java Runtime, is part of the Java Development Kit (JDK), a set of programming tools for developing Java applications. The Java Runtime Environment provides the minimum requirements for executing a Java application; it consists of the Java Virtual Machine (JVM), core classes, and supporting files.

Fig: Program conversion

The JVM provides a secured runtime environment for running Java applications. The byte code verifier within the JVM checks the validity of each byte code before it is submitted for execution to the CPU. The JVM also checks for illegal memory access. At runtime, an application is allocated a memory space; an attempt to access a memory location outside this space is trapped by the JVM. Java also does not support pointer arithmetic. The prevention of illegal memory access and the absence of pointer arithmetic make it very difficult to introduce a virus, which is malicious code.
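The runtime checking described above can be observed directly: an out-of-bounds array access is trapped by the JVM and surfaced as an exception rather than silently corrupting memory (a minimal sketch; the class name is ours):

```java
public class BoundsDemo {
    public static void main(String[] args) {
        int[] a = new int[3];
        try {
            System.out.println(a[10]); // illegal access: outside the allocated space
        } catch (ArrayIndexOutOfBoundsException e) {
            // the JVM trapped the access instead of reading arbitrary memory
            System.out.println("trapped illegal access");
        }
    }
}
```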
Small

Java is small; the compiled byte code is usually very small. A typical Hello World application's byte code is typically a few hundred bytes. The minimal runtime environment required to run this code is usually less than 1 MB.

Interpretation vs. Compilation

Java incorporates elements of both interpretation and compilation. Here is more information on these two approaches:

Interpretation

An interpreter reads in one line of a program and executes it before going to the next line. The line is first parsed into its smallest component operations and then each operation is executed. (This could be done with something like the switch statement in C, with every possible operation case listed.) The interpreter is normally a program compiled into the local machine code, so its operations run at full speed. BASIC was one of the earliest interpreted languages, and each text line is interpreted. Similarly, scripting languages like Perl are also interpreted. Interpretation simplifies the programming environment since there are no intermediate steps between writing or modifying the code and running it. Results are known immediately, so debugging is fast. Also, the programs are easily transportable to other platforms (if an interpreter is available on them). The drawback is slow performance. The interpreter must read a line, translate it, find the corresponding machine-level code, and then execute the instructions.

Compilation

The program text file is first converted to native machine code with a program called a compiler. (A linker program may also be necessary to connect together multiple program code files.) The output file of the compiler is the executable program that runs. FORTRAN, C/C++, and Pascal are all compiled languages.
The biggest advantage of a compiled language is fast performance, since the machine code instructions load directly into the processor and execute. In addition, the compiler can optimize the program, since it looks at the whole program at once rather than simply line by line as an interpreter does. The disadvantages include slower debugging, since after every correction and modification the program must be recompiled. Also, since the executable is in local machine code, executable files are not usually transportable to other platforms; the source code must be recompiled on those machines.

The Java Approach

Java incorporates both interpretation and compilation. The text program is compiled to the "machine" code, called byte codes, for the Java Virtual Machine (JVM, or just VM). The JVM simulates a processor that executes the byte code instructions; that is, the JVM interprets the byte codes. The byte codes can be run on any platform for which a JVM has been developed. The program runs "inside" the JVM, so it doesn't care what platform it is on.

Thus, Java attempts to get the best of both worlds. The compilation step allows for some degree of optimization of the code, and the JVM interpretation allows for portability. There remains the drawback of an extra compilation step after every correction during debugging. Also, the interpretation of byte codes is still slower in many cases than a program in local machine code. Advanced JVMs can ameliorate this, however, and in many cases now reach speeds similar to programs compiled to local machine code.
Java Virtual Machine

All Java code runs under a JVM. A JVM is a piece of software that runs on the client machine. A Java application may be deployed on a web server and served to the client machines as an applet. Each client machine may run a different operating system; however, each machine is required to run a JVM of its own. The byte code supplied by the web server runs on each of the client platforms without any modifications. Thus, the compiled Java code is platform-neutral.

JVM Design

Unlike many hardware processors, the JVM does not allow access to registers that hold program counters, operands, etc. Instead it uses operand stacks and local variables. Every time a method is called, or invoked, a new stack (Last-In-First-Out memory) is created to hold operand values for instructions and to receive results from an instruction operation. Method argument values are passed via the stack, and the method return value is passed via the stack. The stack values are 32-bit. For example, an iconst_2 instruction pushes the integer value 2 onto the top of the stack. Note: this is an example of where knowing something about the JVM helps explain an important aspect of the Java language.

Note that double and long values, which are 64 bits, require two of the 32-bit-wide slots on the stack. This requires the JVM to carry out two stack operations to place or remove such values on the stack. This can cause problems if a process (that is, a thread) is stopped in between these two operations: the data will be left in an indeterminate state. In fact, the stop(), suspend(), and resume() methods in the original Thread class of version 1.1 were deprecated just to avoid this kind of problem.

Similarly, memory is allocated for local variables in each method invocation, and each variable is given a number. For example, a local variable i might become variable 1; the instruction istore_1 then stores the current value at the top of the stack into local variable 1. There are a number of other features used in the JVM, such as a Constant Pool that holds symbolic data for a class.

JVM Implementation

Although the byte code cannot access registers or directly reference memory locations and must obey various other restrictions, the actual JVM program can use internally whatever techniques are convenient for a particular platform. As long as the Java byte code sees only a JVM-specification-compliant system, the JVM programmer has broad discretion in its implementation. Java was always intended for a wide array of platforms, including very simple embedded processors that might provide few or no registers, so the stack approach was taken to allow Java to run on such basic hardware. Of course, the JVM program itself will run as normal on a processor with a register architecture. Typically the JVM is written in C (since virtually every platform has a C compiler). The simplest interpreter-style approach would involve just a big switch statement in which each instruction would jump to the code in the appropriate case section. Most JVMs employ far more sophisticated approaches so as to optimize the performance of the byte code and achieve C-like performance speeds.

Java Server Pages (JSP)

Java Server Pages (JSP) technology provides a simplified, fast way to create web pages that display dynamically generated content.
The JSP specification, developed through an industry-wide initiative led by Sun Microsystems, defines the interaction between the server and the JSP page, and describes the format and syntax of the page. The focus of Java EE 5 has been ease of development by making use of Java language annotations that were introduced by J2SE 5.0. JSP 2.1 supports this goal by defining annotations for dependency injection on JSP tag handlers and context listeners.
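As a sketch of what such a page looks like (the file name and markup below are illustrative, not taken from the project), a JSP page mixes static HTML with Java expressions and scriptlets; the container translates the page into a servlet, compiles it once, and re-executes it for each request:

```jsp
<%-- items.jsp: hypothetical page; the container handles translation. --%>
<html>
  <body>
    <h3>Page generated at: <%= new java.util.Date() %></h3>
    <% for (int i = 1; i <= 3; i++) { %>
      <p>Auction item <%= i %></p>
    <% } %>
  </body>
</html>
```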
Another key concern of the Java EE 5 specification has been the alignment of its web-tier technologies, namely Java Server Pages (JSP), Java Server Faces (JSF), and the Java Server Pages Standard Tag Library (JSTL).

With most programming languages, you either compile or interpret a program so that you can run it on your computer. The Java programming language is unusual in that a program is both compiled and interpreted. With the compiler, you first translate a program into an intermediate language called Java byte codes, the platform-independent codes interpreted by the interpreter on the Java platform. The interpreter parses and runs each Java byte code instruction on the computer. Compilation happens just once; interpretation occurs each time the program is executed. The following figure illustrates how this works.

Fig 3.1.1: Compilation of a Java program

You can think of Java byte codes as the machine code instructions for the Java Virtual Machine (Java VM). Every Java interpreter, whether it's a development tool or a web browser that can run applets, is an implementation of the Java VM. Java byte codes help make "write once, run anywhere" possible. You can compile your program into byte codes on any platform that has a Java compiler. The byte codes can then be run on any implementation of the Java VM. That means that as long as a computer has a Java VM, the same program written in the Java programming language can run on Windows 2000, a Solaris workstation, or an iMac.
Fig 3.1.2: Java program execution

The Java Platform

A platform is the hardware or software environment in which a program runs. We've already mentioned some of the most popular platforms, like Windows 2000, Linux, Solaris, and MacOS. Most platforms can be described as a combination of the operating system and hardware. The Java platform differs from most other platforms in that it's a software-only platform that runs on top of other hardware-based platforms. The Java platform has two components:

• The Java Virtual Machine (Java VM)
• The Java Application Programming Interface (Java API)

You've already been introduced to the Java VM. It's the base for the Java platform and is ported onto various hardware-based platforms. The Java API is a large collection of ready-made software components that provide many useful capabilities, such as graphical user interface (GUI) widgets. The Java API is grouped into libraries of related classes and interfaces; these libraries are known as packages. The section What Can Java Technology Do? highlights what functionality some of the packages in the Java API provide. The following figure depicts a program that's running on the Java platform. As the figure shows, the Java API and the virtual machine insulate the program from the hardware.
Fig 3.1.3: A program running on the Java platform

Native code is code that, after compilation, runs on a specific hardware platform. As a platform-independent environment, the Java platform can be a bit slower than native code. However, smart compilers, well-tuned interpreters, and just-in-time byte code compilers can bring performance close to that of native code without threatening portability.

What Can Java Technology Do?

The most common types of programs written in the Java programming language are applets and applications. If you've surfed the Web, you're probably already familiar with applets. An applet is a program that adheres to certain conventions that allow it to run within a Java-enabled browser.

However, the Java programming language is not just for writing cute, entertaining applets for the Web. The general-purpose, high-level Java programming language is also a powerful software platform. Using the generous API, you can write many types of programs. An application is a standalone program that runs directly on the Java platform. A special kind of application known as a server serves and supports clients on a network. Examples of servers are web servers, proxy servers, mail servers, and print servers. Another specialized program is a servlet. A servlet can almost be thought of as an applet that runs on the server side. Java Servlets are a popular choice for building interactive web applications, replacing the use of CGI scripts. Servlets are similar to applets in that they are runtime extensions of applications. Instead of working in browsers, though, servlets run within Java web servers, configuring or tailoring the server.

How does the API support all these kinds of programs? It does so with packages of software components that provide a wide range of functionality. Every full implementation of the Java platform gives you the following features:

• The essentials: Objects, strings, threads, numbers, input and output, data structures, system properties, date and time, and so on.
• Applets: The set of conventions used by applets.
• Networking: URLs, TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) sockets, and IP (Internet Protocol) addresses.
• Internationalization: Help for writing programs that can be localized for users worldwide. Programs can automatically adapt to specific locales and be displayed in the appropriate language.
• Security: Both low level and high level, including electronic signatures, public and private key management, access control, and certificates.
• Software components: Known as JavaBeans™; can plug into existing component architectures.
• Object serialization: Allows lightweight persistence and communication via Remote Method Invocation (RMI).
• Java Database Connectivity (JDBC™): Provides uniform access to a wide range of relational databases.

The Java platform also has APIs for 2D and 3D graphics, accessibility, servers, collaboration, telephony, speech, animation, and more. The following figure depicts what is included in the Java 2 SDK.

Fig: The Java 2 SDK
How Will Java Technology Change My Life?

We can't promise you fame, fortune, or even a job if you learn the Java programming language. Still, it is likely to make your programs better and require less effort than other languages. We believe that Java technology will help you do the following:

• Get started quickly: Although the Java programming language is a powerful object-oriented language, it's easy to learn, especially for programmers already familiar with C or C++.
• Write less code: Comparisons of program metrics (class counts, method counts, and so on) suggest that a program written in the Java programming language can be four times smaller than the same program in C++.
• Write better code: The Java programming language encourages good coding practices, and its garbage collection helps you avoid memory leaks. Its object orientation, its JavaBeans component architecture, and its wide-ranging, easily extendible API let you reuse other people's tested code and introduce fewer bugs.
• Develop programs more quickly: Your development time may be as much as twice as fast as writing the same program in C++. Why? You write fewer lines of code, and Java is a simpler programming language than C++.
• Avoid platform dependencies with 100% Pure Java: You can keep your program portable by avoiding the use of libraries written in other languages. The 100% Pure Java™ Product Certification Program has a repository of historical process manuals, white papers, brochures, and similar materials online.
• Write once, run anywhere: Because 100% Pure Java programs are compiled into machine-independent byte codes, they run consistently on any Java platform.
• Distribute software more easily: You can upgrade applets easily from a central server. Applets take advantage of the feature of allowing new classes to be loaded "on the fly," without recompiling the entire program.

ODBC

Microsoft Open Database Connectivity (ODBC) is a standard programming interface for application developers and database systems providers. Before ODBC became a de facto standard for Windows programs to interface with database systems, programmers had to use proprietary languages for each database they wanted to connect to. Now, ODBC has made the choice of the database system almost irrelevant from a coding perspective, which is as it should be. Application developers have much more important things to worry about than the syntax needed to port their program from one database to another when business needs suddenly change.

Through the ODBC Administrator in Control Panel, you can specify the particular database that is associated with a data source that an ODBC application program is written to use. Think of an ODBC data source as a door with a name on it. Each door leads to a particular database. For example, the data source named Sales Figures might be a SQL Server database, whereas the Accounts Payable data source could refer to an Access database. The physical database referred to by a data source can reside anywhere on the LAN.

The ODBC system files are not installed on your system by Windows 95. Rather, they are installed when you set up a separate database application, such as SQL Server Client or Visual Basic 4.0. When the ODBC icon is installed in Control Panel, it uses a file called ODBCINST.DLL. It is also possible to administer your ODBC data sources through a stand-alone program called ODBCADM.EXE. There is a 16-bit and a 32-bit version of this program, and each maintains a separate list of ODBC data sources.
From a programming perspective, the beauty of ODBC is that the application can be written to use the same set of function calls to interface with any data source, regardless of the database vendor. The source code of the application doesn't change whether it talks to Oracle or SQL Server; we only mention these two as an example. There are ODBC drivers available for several dozen popular database systems. Even Excel spreadsheets and plain text files can be turned into data sources. The operating system uses the Registry information written by the ODBC Administrator to determine which low-level ODBC drivers are needed to talk to the data source (such as the interface to Oracle or SQL Server). The loading of the ODBC drivers is transparent to the ODBC application program. In a client/server environment, the ODBC API even handles many of the network issues for the application programmer.

The advantages of this scheme are so numerous that you are probably thinking there must be some catch. The only disadvantage of ODBC is that it isn't as efficient as talking directly to the native database interface. ODBC has had many detractors make the charge that it is too slow. Microsoft has always claimed that the critical factor in performance is the quality of the driver software that is used. In our humble opinion, this is true. The availability of good ODBC drivers has improved a great deal recently. And anyway, the criticism about performance is somewhat analogous to that aimed at compilers: people said compilers would never match the speed of pure assembly language. Maybe not, but the compiler (or ODBC) gives you the opportunity to write cleaner programs, which means you finish sooner. Meanwhile, computers get faster every year.

JDBC

In an effort to set an independent database standard API for Java, Sun Microsystems developed Java Database Connectivity, or JDBC. JDBC offers a generic SQL database access mechanism that provides a consistent interface to a variety of RDBMSs. This consistent interface is achieved through the use of "plug-in" database connectivity modules, or drivers. If a database vendor wishes to have JDBC support, it must provide a driver for each platform that the database and Java run on.

To gain wider acceptance of JDBC, Sun based JDBC's framework on ODBC. As discussed earlier in this chapter, ODBC has widespread support on a variety of platforms.
Basing JDBC on ODBC allowed vendors to bring JDBC drivers to market much faster than developing a completely new connectivity solution. JDBC was announced in March of 1996. It was released for a 90-day public review that ended June 8, 1996. Because of user input, the final JDBC v1.0 specification was released soon after.
The remainder of this section covers enough information about JDBC for you to know what it is about and how to use it effectively. This is by no means a complete overview of JDBC; that would fill an entire book.

JDBC Architecture

Two-tier and Three-tier Processing Models

The JDBC API supports both two-tier and three-tier processing models for database access.

Figure 1: Two-tier Architecture for Data Access.

In the two-tier model, a Java applet or application talks directly to the data source. This requires a JDBC driver that can communicate with the particular data source being accessed. A user's commands are delivered to the database or other data source, and the results of those statements are sent back to the user. The data source may be located on another machine to which the user is connected via a network. This is referred to as a client/server configuration, with the user's machine as the client and the machine housing the data source as the server. The network can be an intranet, which, for example, connects employees within a corporation, or it can be the Internet.

In the three-tier model, commands are sent to a "middle tier" of services, which then sends the commands to the data source. The data source processes the commands and sends the results back to the middle tier, which then sends them to the user. MIS directors find the three-tier model very attractive because the middle tier makes it possible to maintain control over access and the kinds of updates that can be made to corporate data. Another advantage is that it simplifies the deployment of applications. Finally, in many cases, the three-tier architecture can provide performance advantages.
Figure 2: Three-tier Architecture for Data Access.

Until recently, the middle tier has often been written in languages such as C or C++, which offer fast performance. However, with the introduction of optimizing compilers that translate Java byte code into efficient machine-specific code, and technologies such as Enterprise JavaBeans™, the Java platform is fast becoming the standard platform for middle-tier development. This is a big plus, making it possible to take advantage of Java's robustness, multithreading, and security features.

With enterprises increasingly using the Java programming language for writing server code, the JDBC API is being used more and more in the middle tier of a three-tier architecture. Some of the features that make JDBC a server technology are its support for connection pooling, distributed transactions, and disconnected row sets. The JDBC API is also what allows access to a data source from a Java middle tier.
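The two-tier model above can be sketched with the core java.sql classes. The URL, credentials, and table name below are hypothetical; with no MySQL driver or database present, the call simply falls into the error branch instead of returning rows:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class TwoTierSketch {
    // Open a connection, run one query, and fold the rows into a string.
    // Returns an error string instead of throwing when no driver/database
    // is available, so the sketch is safe to run anywhere.
    static String listItems(String url, String user, String pass) {
        try (Connection conn = DriverManager.getConnection(url, user, pass);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT id, title FROM items")) {
            StringBuilder out = new StringBuilder();
            while (rs.next()) {
                out.append(rs.getInt("id")).append(": ")
                   .append(rs.getString("title")).append('\n');
            }
            return out.toString();
        } catch (SQLException e) {
            return "no database: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Hypothetical URL matching the report's stated MySQL/JDBC stack.
        System.out.println(listItems("jdbc:mysql://localhost:3306/auctiondb",
                                     "user", "password"));
    }
}
```

The same code works unchanged against any database for which a JDBC driver is on the classpath; only the URL changes.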
JDBC Goals

Few software packages are designed without goals in mind. JDBC was designed with many goals in mind, and those goals drove the development of the API. These goals, in conjunction with early reviewer feedback, shaped the JDBC class library into a solid framework for building database applications in Java. The goals that were set for JDBC are important; they give you some insight as to why certain classes and functionalities behave the way they do. The design goals for JDBC are as follows:

1. SQL Level API
The designers felt that their main goal was to define a SQL interface for Java. Although not the lowest database interface level possible, it is at a low enough level for higher-level tools and APIs to be created. Conversely, it is at a high enough level for application programmers to use it confidently. Attaining this goal allows future tool vendors to "generate" JDBC code and to hide many of JDBC's complexities from the end user.

2. SQL Conformance
SQL syntax varies as you move from database vendor to database vendor. In an effort to support a wide variety of vendors, JDBC allows any query statement to be passed through it to the underlying database driver. This allows the connectivity module to handle non-standard functionality in a manner that is suitable for its users.

3. JDBC must be implementable on top of common database interfaces
The JDBC SQL API must "sit" on top of other common SQL-level APIs. This goal allows JDBC to use existing ODBC-level drivers by the use of a software interface. This interface would translate JDBC calls to ODBC and vice versa.

4. Provide a Java interface that is consistent with the rest of the Java system
Because of Java's acceptance in the user community thus far, the designers felt that they should not stray from the current design of the core Java system.
5. Keep it simple
This goal probably appears in all software design goal listings. JDBC is no exception. Sun felt that the design of JDBC should be very simple, allowing for only one method of completing a task per mechanism. Allowing duplicate functionality only serves to confuse the users of the API.

6. Use strong, static typing wherever possible
Strong typing allows more error checking to be done at compile time; consequently, fewer errors appear at runtime.

7. Keep the common cases simple
Because, more often than not, the usual SQL calls used by the programmer are simple SELECTs, INSERTs, DELETEs and UPDATEs, these queries should be simple to perform with JDBC. However, more complex SQL statements should also be possible.

Finally, we decided to proceed with the implementation using Java networking, and for dynamically updating the cache table we use an MS Access database. Java has two aspects: a programming language and a platform. Java is a high-level programming language.
3.2 Networking

3.2.1 TCP/IP Stack

The TCP/IP stack is shorter than the OSI one. TCP is a connection-oriented protocol; UDP (User Datagram Protocol) is a connectionless protocol.

IP datagrams
The IP layer provides a connectionless and unreliable delivery system. It considers each datagram independently of the others. Any association between datagrams must be supplied by the higher layers. The IP layer supplies a checksum that includes its own header. The header includes the source and destination addresses. The IP layer handles routing through an Internet. It is also responsible for breaking up large datagrams into smaller ones for transmission and reassembling them at the other end.
UDP
UDP is also connectionless and unreliable. What it adds to IP is a checksum for the contents of the datagram and port numbers. These are used to give a client/server model (see later).

TCP
TCP supplies logic to give a reliable connection-oriented protocol above IP. It provides a virtual circuit that two processes can use to communicate.

Internet addresses
In order to use a service, you must be able to find it. The Internet uses an address scheme so that machines can be located. The address is a 32-bit integer which gives the IP address. This encodes a network ID and more addressing. The network ID falls into various classes according to the size of the network address.

Network address
Class A uses 8 bits for the network address, with 24 bits left over for other addressing. Class B uses 16-bit network addressing. Class C uses 24-bit network addressing, and Class D uses all 32.

Subnet address
Internally, the UNIX network is divided into subnetworks. Building 11 is currently on one subnetwork and uses 10-bit addressing, allowing 1024 different hosts.

Host address
8 bits are finally used for host addresses within our subnet. This places a limit of 256 machines that can be on the subnet.

Total address
The 32-bit address is usually written as 4 integers separated by dots.
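The 32-bit encoding described above can be seen directly in Java: `InetAddress` parses a numeric dotted-quad literal (no DNS lookup is needed for a numeric address), and packing its four octets reproduces the single integer that the IP header carries. A minimal sketch:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class Ipv4Demo {
    // Pack the four dotted-quad octets into the single 32-bit value the
    // IP header actually carries (returned as a long so the result is
    // never negative); returns -1 for an unparsable address.
    public static long toInt(String dottedQuad) {
        try {
            byte[] b = InetAddress.getByName(dottedQuad).getAddress();
            return ((b[0] & 0xFFL) << 24) | ((b[1] & 0xFFL) << 16)
                 | ((b[2] & 0xFFL) << 8)  |  (b[3] & 0xFFL);
        } catch (UnknownHostException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        // 192.168.1.1 packs to 0xC0A80101
        System.out.printf("0x%08X%n", toInt("192.168.1.1"));
    }
}
```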
Port addresses
A service exists on a host and is identified by its port, a 16-bit number. To send a message to a server, you send it to the port for that service on the host that it is running on. This is not location transparency! Certain of these ports are "well known".

Sockets
A socket is a data structure maintained by the system to handle network connections. A socket is created using the call socket. It returns an integer that is like a file descriptor. In fact, under Windows, this handle can be used with the ReadFile and WriteFile functions.

#include <sys/types.h>
#include <sys/socket.h>
int socket(int family, int type, int protocol);

Here "family" will be AF_INET for IP communications, protocol will be zero, and type will depend on whether TCP or UDP is used. Two processes wishing to communicate over a network create a socket each. These are similar to two ends of a pipe, but the actual pipe does not yet exist.

JFreeChart
JFreeChart is a free 100% Java chart library that makes it easy for developers to display professional quality charts in their applications. JFreeChart's extensive feature set includes: a consistent and well-documented API, supporting a wide range of chart types; a flexible design that is easy to extend, and targets both server-side and client-side applications; support for many output types, including Swing components, image files (including PNG and JPEG), and vector graphics file formats (including PDF, EPS and SVG). JFreeChart is "open source" or, more specifically, free software. It is distributed under the terms of the GNU Lesser General Public Licence (LGPL), which permits use in proprietary applications.
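Returning to the sockets discussion above: the C `socket` call maps onto Java's `java.net` classes, with `ServerSocket` as the server end and `Socket` as the client end, and the two connected endpoints behaving like the two ends of a pipe. The loopback sketch below is illustrative only; the one-line echo protocol is an assumption for the demo, not project code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class EchoDemo {
    // Send one line to a throwaway loopback echo server and return the
    // reply, or null if anything goes wrong.
    public static String echoOnce(String message) {
        try (ServerSocket server = new ServerSocket(0)) {   // 0 = any free port
            Thread serverThread = new Thread(() -> {
                try (Socket peer = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(peer.getInputStream()));
                     PrintWriter out = new PrintWriter(peer.getOutputStream(), true)) {
                    out.println(in.readLine());             // echo one line back
                } catch (IOException ignored) { }
            });
            serverThread.start();
            try (Socket client = new Socket("127.0.0.1", server.getLocalPort());
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream()))) {
                out.println(message);
                String reply = in.readLine();
                serverThread.join();
                return reply;
            }
        } catch (IOException | InterruptedException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(echoOnce("hello"));
    }
}
```

Passing port 0 to `ServerSocket` asks the system for any free ephemeral port, which is why the demo never collides with a "well known" service port.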
1. Map Visualizations
Charts showing values that relate to geographical areas. Some examples include: (a) population density in each state of the United States, (b) income per capita for each country in Europe, (c) life expectancy in each country of the world. The tasks in this project include: sourcing freely redistributable vector outlines for the countries of the world and for states/provinces in particular countries (the USA in particular, but also other areas); creating an appropriate dataset interface (plus default implementation) and a renderer, and integrating these with the existing XYPlot class in JFreeChart; testing, documenting, testing some more, documenting some more.

2. Time Series Chart Interactivity
Implement a new (to JFreeChart) feature for interactive time series charts: display a separate control that shows a small version of ALL the time series data, with a sliding "view" rectangle that allows you to select the subset of the time series data to display in the main chart.

3. Dashboards
There is currently a lot of interest in dashboard displays. Create a flexible dashboard mechanism that supports a subset of JFreeChart chart types (dials, pies, thermometers, bars, and lines/time series) and that can be delivered easily via both Java Web Start and an applet.

4. Property Editors
The property editor mechanism in JFreeChart only handles a small subset of the properties that can be set for charts. Extend (or reimplement) this mechanism to provide greater end-user control over the appearance of the charts.
3.3 TOMCAT WEB SERVER

Tomcat is an open source web server developed by the Apache Group. Apache Tomcat is the servlet container that is used in the official Reference Implementation for the Java Servlet and JavaServer Pages technologies. The Java Servlet and JavaServer Pages specifications are developed by Sun under the Java Community Process. Web servers like Apache Tomcat support only web components, while an application server supports web components as well as business components (BEA's WebLogic is one of the popular application servers). To develop a web application with JSP/servlets, install a web server such as JRun or Tomcat to run your application.

Fig 3.3.1 Tomcat Web Server
4. PROJECT ANALYSIS

PROBLEMS IN EXISTING SYSTEM
The traditional online shopping business model allows sellers to sell a product or service at a preset price, where buyers can choose to purchase if they find it to be a good deal. Online auction, however, is a different business model by which items are sold through price bidding. There is often a starting price and expiration time specified by the sellers. Once the auction starts, potential buyers bid against each other, and the winner gets the item with their highest winning bid.

The Systems Development Life Cycle (SDLC), or Software Development Life Cycle in systems engineering and information systems, is the process of creating or altering systems, and the models and methodologies that people use to develop these systems. In software engineering the SDLC concept underpins many kinds of software development methodologies. These methodologies form the framework for planning and controlling the creation of an information system: the software development process.

SOFTWARE MODEL OR ARCHITECTURE ANALYSIS
Structured project management techniques (such as an SDLC) enhance management's control over projects by dividing complex tasks into manageable sections. A software life cycle model is either a descriptive or prescriptive characterization of how software is or should be developed. However, none of the SDLC models discuss key issues like the change management, incident management and release management processes within the SDLC process; these are addressed in the overall project management. In the proposed hypothetical model, the concept of user-developer interaction in the conventional SDLC model has been converted into a three-dimensional model which comprises the user, the owner and the developer.

WHAT IS SDLC?
A software life cycle deals with various parts and phases from planning to testing and deploying software. All these activities are carried out in different ways, as per the needs.
Each way is known as a Software Development Lifecycle Model (SDLC). A software life cycle model is either a descriptive or prescriptive characterization of
how software is or should be developed. A descriptive model describes the history of how a particular software system was developed. Descriptive models may be used as the basis for understanding and improving software development processes or for building empirically grounded prescriptive models.

SDLC models
* The Linear model (Waterfall): separate and distinct phases of specification and development; all activities proceed in linear fashion; the next phase starts only when the previous one is complete.
* Evolutionary development: specification and development are interleaved (Spiral, incremental, prototype-based, Rapid Application Development).
  - Incremental Model (Waterfall in iteration).
  - RAD (Rapid Application Development): the focus is on developing a quality product in less time.
  - Spiral Model: we start from a smaller module and keep on building it like a spiral. It is also called component-based development.
* Formal systems development: a mathematical system model is formally transformed to an implementation.
* Agile methods: inducing flexibility into development.
* Reuse-based development: the system is assembled from existing components.

SDLC Methodology: Spiral Model
The spiral model is similar to the incremental model, with more emphasis placed on risk analysis. The spiral model has four phases: Planning, Risk Analysis, Engineering and Evaluation. The SPIRAL MODEL was defined by Barry Boehm in his 1988 article, "A Spiral Model of Software Development and Enhancement." This model was not the first model to discuss iterative development, but it was the first model to explain why the iteration matters. The steps of the Spiral Model can be generalized as follows:
 The new system requirements are defined in as much detail as possible. This usually involves interviewing a number of users representing all the external or internal users and other aspects of the existing system.
 A preliminary design is created for the new system.
 A first prototype of the new system is constructed from the preliminary design. This is usually a scaled-down system, and represents an approximation of the characteristics of the final product.
 A second prototype is evolved by a fourfold procedure:
 Evaluating the first prototype in terms of its strengths, weaknesses, and risks.
 Defining the requirements of the second prototype.
 Planning and designing the second prototype.
 Constructing and testing the second prototype.
 At the customer's option, the entire project can be aborted if the risk is deemed too great. Risk factors might involve development cost overruns, operating-cost miscalculation, or any other factor that could, in the customer's judgment, result in a less-than-satisfactory final product.
 The existing prototype is evaluated in the same manner as was the previous prototype, and if necessary, another prototype is developed from it according to the fourfold procedure outlined above.
 The preceding steps are iterated until the customer is satisfied that the refined prototype represents the final product desired.
 The final system is constructed, based on the refined prototype.
Fig 4.1 Spiral Model

Advantages
 High amount of risk analysis.
 Good for large and mission-critical projects.
 Software is produced early in the software life cycle.
4.1 Existing System
The traditional online shopping business model allows sellers to sell a product or service at a preset price, where buyers can choose to purchase if they find it to be a good deal. Online auction, however, is a different business model by which items are sold through price bidding. There is often a starting price and expiration time specified by the sellers. Once the auction starts, potential buyers bid against each other, and the winner gets the item with their highest winning bid.

Disadvantages
 A product expiration time is imposed.
 The product may not be delivered to the customer because there is no fraud detection.
 Sellers can interact without registration.

4.2 Proposed System
We propose an online probit model framework which takes online feature selection, coefficient bounds from human knowledge and multiple instance learning into account simultaneously. By empirical experiments on real-world online auction fraud detection data we show that this model can potentially detect more fraud and significantly reduce customer complaints compared to several baseline models and the human-tuned rule-based system. Human experts with years of experience created many rules to detect whether a user is fraudulent or not. If the fraud score is above a certain threshold, the case enters a queue for further investigation by human experts. Once it is reviewed, the final result is labeled as a boolean, i.e., fraud or clean. Cases with higher scores have higher priorities in the queue to be reviewed. Cases whose fraud scores are below the threshold are determined as clean by the system without any human judgment.
ADVANTAGES
 There is no time limit or day limit.
 Time saving.

DISADVANTAGE
 The purchased product may not be delivered.

4.2.1 SOLUTION OF THESE PROBLEMS
As described in Section 4.2, the proposed online probit model framework addresses these problems by taking online feature selection, coefficient bounds from human knowledge and multiple instance learning into account simultaneously. Expert-created rules produce a fraud score for each case; cases above the threshold enter a priority queue for human review, where they are labeled fraud or clean, while cases below the threshold are determined as clean without any human judgment.
4.3 MODULES

• Rule-based features: Human experts with years of experience created many rules to detect whether a user is fraudulent or not. An example of such a rule is the "blacklist", i.e. whether the user has been detected or complained about as fraudulent before. Each rule can be regarded as a binary feature that indicates the fraud likeliness.

• Selective labeling: If the fraud score is above a certain threshold, the case enters a queue for further investigation by human experts. Once it is reviewed, the final result is labeled as a boolean, i.e., fraud or clean. Cases with higher scores have higher priorities in the queue to be reviewed. Cases whose fraud scores are below the threshold are determined as clean by the system without any human judgment.

• Fraud churn: Once a case is labeled as fraud by human experts, it is very likely that the seller is not trustworthy and may also be selling other frauds; hence all the items submitted by the same seller are labeled as fraud too. The fraudulent seller, along with his/her cases, is removed from the website immediately once detected.

• User complaint: Buyers can file complaints to claim losses if they have recently been deceived by fraudulent sellers. The administrator views the various types of complaints and the percentage of each type. If the complaint count for a product exceeds a threshold value, the administrator sets the trustability of the product to Untrusted or Banned. If a product is set as Banned, users cannot view the product on the website.
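The rule-based scoring and selective-labeling mechanisms above can be sketched in a few lines. The rule weights and the threshold of 50 here are invented purely for illustration; in the proposed system the coefficients are learned by the online probit model rather than hand-set.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class FraudTriage {
    static final int THRESHOLD = 50;   // illustrative value only

    // Each rule is a binary feature; the score is a weighted sum,
    // capped at 100. The weights are made up for this sketch.
    public static int score(boolean onBlacklist, boolean newAccount,
                            boolean wireTransferOnly) {
        int s = 0;
        if (onBlacklist)      s += 60;
        if (newAccount)       s += 20;
        if (wireTransferOnly) s += 30;
        return Math.min(s, 100);
    }

    // Selective labeling: only cases at or above the threshold enter the
    // expert queue, highest scores first; the rest are auto-labeled clean.
    public static List<Integer> reviewQueue(List<Integer> scores) {
        List<Integer> queue = new ArrayList<>();
        for (int s : scores) {
            if (s >= THRESHOLD) queue.add(s);
        }
        queue.sort(Collections.reverseOrder());
        return queue;
    }

    public static void main(String[] args) {
        System.out.println(score(true, false, true));               // 90
        System.out.println(reviewQueue(List.of(90, 20, 70, 40)));   // [90, 70]
    }
}
```

Fraud churn would then propagate a fraud label from one reviewed case to all items from the same seller, which this sketch omits.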
5. PROJECT DESIGN

5.1 UML DIAGRAMS
A diagram is the graphical presentation of a set of elements, most often rendered as a connected graph of vertices (things) and arcs (relationships). For this reason, the UML includes nine such diagrams. The Unified Modeling Language (UML) is probably the most widely known and used notation for object-oriented analysis and design. It is the result of the merger of several early contributions to object-oriented methods. The UML is a standard language for writing software blueprints; it may be used to visualize, specify, construct, and document the artifacts of a software system. A modeling language is a language whose vocabulary and rules focus on the conceptual and physical representation of a system.

Modeling is the designing of software applications before coding. Modeling is an essential part of large software projects, and helpful to medium and even small projects as well. A model plays the analogous role in software development that blueprints and other plans (site maps, elevations, physical models) play in the building of a skyscraper. Using a model, those responsible for a software development project's success can assure themselves that business functionality is complete and correct, end-user needs are met, and program design supports requirements for scalability, robustness, security, extendibility, and other characteristics, before implementation in code renders changes difficult and expensive to make.

The underlying premise of UML is that no one diagram can capture the different elements of a system in its entirety. Hence, UML is made up of nine diagrams that can be used to model a system at different points of time in the software life cycle of a system. The nine UML diagrams are:

Use case diagram
The use case diagram is used to identify the primary elements and processes that form the system. The primary elements are termed "actors" and the processes are called "use cases."
The use case diagram shows which actors interact with each use case.
Class diagram
The class diagram is used to refine the use case diagram and define a detailed design of the system. The class diagram classifies the actors defined in the use case diagram into a set of interrelated classes. The relationship or association between the classes can be either an "is-a" or "has-a" relationship. Each class in the class diagram may be capable of providing certain functionalities. The functionalities provided by a class are termed its "methods". Apart from this, each class may have certain "attributes" that uniquely identify the class.

Object diagram
The object diagram is a special kind of class diagram. An object is an instance of a class. This essentially means that an object represents the state of a class at a given point of time while the system is running. The object diagram captures the state of different classes in the system and their relationships or associations at a given point of time.

State diagram
A state diagram, as the name suggests, represents the different states that objects in the system undergo during their life cycle. Objects in the system change states in response to events. In addition to this, a state diagram also captures the transition of the object's state from an initial state to a final state in response to events affecting the system.

Activity diagram
The process flows in the system are captured in the activity diagram. Similar to a state diagram, an activity diagram also consists of activities, actions, transitions, initial and final states, and guard conditions.

Sequence diagram
A sequence diagram represents the interaction between different objects in the system. The important aspect of a sequence diagram is that it is time-ordered. This
means that the exact sequence of the interactions between the objects is represented step by step. Different objects in the sequence diagram interact with each other by passing "messages".

Collaboration diagram
A collaboration diagram groups together the interactions between different objects. The interactions are listed as numbered interactions that help to trace the sequence of the interactions. The collaboration diagram helps to identify all the possible interactions that each object has with other objects.

Component diagram
The component diagram represents the high-level parts that make up the system. This diagram depicts, at a high level, what components form part of the system and how they are interrelated. A component diagram depicts the components culled after the system has undergone the development or construction phase.

Deployment diagram
The deployment diagram captures the configuration of the runtime elements of the application. This diagram is by far the most useful when a system is built and ready to be deployed. Now that we have an idea of the different UML diagrams, let us see if we can somehow group these diagrams together to enable us to further understand how to use them.
SYSTEM DESIGN (Admin)

Fig 5.1.1 Flow chart for how the admin authorizes sellers
Data Flow Diagram / Use Case Diagram / Flow Diagram
The DFD is also called a bubble chart. It is a simple graphical formalism that can be used to represent a system in terms of the input data to the system, the various processing carried out on this data, and the output data generated by the system.

Seller:
Fig 5.1.2 Flow chart for how the seller posts products
User:
Fig 5.1.3 Flow chart for how the user purchases products
Use case diagram

Fig 5.1.4 Use case diagram for fraud detection
Sequence diagram

Fig 5.1.5 Sequence diagram for fraud detection
Class diagram

Fig 5.1.6 Class diagram for the login page
Activity diagram

Fig 5.1.7 Activity diagram for user, admin and seller login
6. PROJECT TESTING

6.1 INTRODUCTION TO TESTING
Testing is the process of detecting errors. Testing performs a very critical role for quality assurance and for ensuring the reliability of software. The results of testing are also used later on during maintenance.

Psychology of Testing
The aim of testing is often to demonstrate that a program works by showing that it has no errors. However, the basic purpose of the testing phase is to detect the errors that may be present in the program. Hence one should not start testing with the intent of showing that a program works; the intent should be to show that a program doesn't work. Testing is the process of executing a program with the intent of finding errors.

6.2 TESTING OBJECTIVES
The main objective of testing is to uncover a host of errors, systematically and with minimum effort and time. Stated formally:
 Testing is a process of executing a program with the intent of finding an error.
 A successful test is one that uncovers an as-yet-undiscovered error.
 A good test case is one that has a high probability of finding an error, if it exists.
 The tests are inadequate if they fail to detect possibly present errors.
 The software more or less conforms to quality and reliability standards.
6.3 Types of Testing
There are various types of testing:
 Unit Testing
 System Testing
 Acceptance Testing
 Integration Testing

Unit Testing:
In unit testing the analyst tests the programs making up a system. The software units in the system are the modules and routines that are assembled and integrated to perform a specific function. In a large system, many modules at different levels are needed. Unit testing focuses on the modules individually. This enables the tester to detect errors in coding and logic that are contained within the module alone. The errors resulting from the interaction between modules are initially ignored. Unit testing can be performed in a bottom-up approach, starting with the smallest/lowest module and proceeding one at a time. For each module in bottom-up testing, a short program executes the module and provides the needed data, so that the module is asked to perform the way it will be embedded within the larger system. When bottom-level modules are tested, attention turns to those on the next level that use the lower-level ones. They are tested individually and then linked with the previously examined lower-level modules. Top-down testing, as the name implies, begins with the upper-level modules. However, since the detailed activities are usually performed in lower-level routines that are not yet provided, stubs are written. A stub is a module shell that can be called by the upper-level module and that, when called, will return a message to the calling module, indicating the correctness of the lower-level module. Often top-down testing is combined with bottom-up testing; that is, some lower-level modules are unit tested and integrated into a top-down testing program.
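A bottom-up unit test is simply a short driver program that feeds a module known data and checks the results before the module is linked to higher levels. The `discounted` module below is a hypothetical example for this sketch, not taken from the project code:

```java
public class PriceModule {
    // Module under test: apply a percentage discount, rounded to cents.
    public static double discounted(double price, int percentOff) {
        if (percentOff < 0 || percentOff > 100) {
            throw new IllegalArgumentException("percentOff out of range");
        }
        return Math.round(price * (100 - percentOff)) / 100.0;
    }

    // Bottom-up driver: exercise the module alone with known inputs,
    // the way a short test program drives each module in unit testing.
    public static void main(String[] args) {
        check(discounted(200.0, 10) == 180.0, "10% off 200.00");
        check(discounted(19.99, 0) == 19.99,  "no discount");
        check(discounted(50.0, 100) == 0.0,   "full discount");
        System.out.println("all module tests passed");
    }

    private static void check(boolean ok, String label) {
        if (!ok) throw new AssertionError("failed: " + label);
    }
}
```

Once the module passes its driver, a stub with the same signature can stand in for it while the upper-level modules are tested top-down.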
System Testing:
System testing does not test the software per se but rather the integration of each module in the system. It also tests to find discrepancies between the system and its original objectives, current specifications and system documentation. The primary concern is the compatibility of individual modules. Analysts try to find areas where modules have been designed with different specifications for data length, type and data element name.

Acceptance Testing:
The software has been tested with realistic data given by the client and produced fruitful results. The software was developed within the specified time limit, satisfying all the requirements specified by the client. A demonstration has been given to the client and the end user covering all the operational features.

Integration Testing:
Integration testing is a systematic technique for constructing the program structure while conducting tests to uncover errors associated with interfacing. The objective is to take unit-tested modules and build a program structure that has been dictated by design.

Code Testing:
This strategy examines the logic of the program. To follow this method we developed some test data that resulted in executing every instruction in the program and module, i.e. every path is tested. Systems are not designed as entire units, nor are they tested as single systems. To ensure that the coding is perfect, two types of testing are performed on all systems.
Types of Testing
1. Black box or functional testing
2. White box testing or structural testing

Black box testing
This method is used when the specified function that a product has been designed to perform is known. The concept of a black box is used to represent a system whose inside workings are not available to inspection. In a black box, the test item is a "black box": since its logic is unknown, all that is known is what goes in and what comes out, or the input and output. Black box testing attempts to find errors in the following categories:
 Incorrect or missing functions
 Interface errors
 Errors in data structures
 Performance errors
 Initialization and termination errors
As shown in the figure of black box testing, we are not thinking of the internal workings; we only ask: what is the output for a given input to our system? The black box is an imaginary box that hides its internal workings.
White box testing
White box testing is concerned with testing the implementation of the program. The intent of structural testing is not to exercise all the inputs or outputs but to exercise the different programming and data structures used in the program. Thus structural testing aims to achieve test cases that will force the desired coverage of different structures. Two types of path testing are statement coverage testing and branch coverage testing. In the white box testing strategy, the internal workings of the box are visible to the tester.

Test Plan
The testing process starts with a test plan. This plan identifies all the testing-related activities that must be performed; it specifies the schedules, allocates the resources, and specifies guidelines for testing. During the testing of a unit, the specified test cases are executed and the actual results are compared with the expected output. The final output of the testing phase is the test report and the error report.

Test Data:
Here all test cases that are used for system testing are specified. The goal is to test the different functional requirements specified in the Software Requirements Specification (SRS) document.

Unit Testing:
Each individual module has been tested against the requirements with some test data.

Test Report:
The module is working properly provided the user enters the required information. All data entry forms have been tested with the specified test cases and all data entry forms are working properly.

Error Report:
If the user does not enter data in the specified order then the user will be prompted with error messages. Error handling was done to handle both expected and unexpected errors.
7. SYSTEM STUDY

The feasibility of the project is analyzed in this phase, and a business proposal is put forth with a very general plan for the project and some cost estimates. During system analysis the feasibility study of the proposed system is to be carried out. This is to ensure that the proposed system is not a burden to the company. For feasibility analysis, some understanding of the major requirements for the system is essential. Three key considerations involved in the feasibility analysis are:
 ECONOMICAL FEASIBILITY
 TECHNICAL FEASIBILITY
 SOCIAL FEASIBILITY

ECONOMICAL FEASIBILITY
This study is carried out to check the economic impact that the system will have on the organization. The amount of funds that the company can pour into the research and development of the system is limited. The expenditures must be justified. The developed system is well within the budget, and this was achieved because most of the technologies used are freely available. Only the customized products had to be purchased.

TECHNICAL FEASIBILITY
This study is carried out to check the technical feasibility, that is, the technical requirements of the system. Any system developed must not have a high demand on the available technical resources; otherwise, high demands will be placed on the client. The developed system must have modest requirements, as only minimal or null changes are required for implementing this system.
SOCIAL FEASIBILITY

This aspect of the study is to check the level of acceptance of the system by the users. This includes the process of training the users to use the system efficiently. The users must not feel threatened by the system; instead, they must accept it as a necessity. The level of acceptance by the users depends solely on the methods employed to educate them about the system and to make them familiar with it. Their confidence must be raised so that they are also able to offer constructive criticism, which is welcomed, as they are the final users of the system.
8. OUTPUT SCREENS

8.1 Home Page

Fig 8.1 Home page of fraud detection in online auctioning

 This is the home page of the fraud detection in online auctioning system.
 Options appear here, such as registration for both user and seller.
 A user login option is also available for login purposes.
 Through the home page, seller and user can interact.
8.2 Seller Registration Page

Fig 8.2 Seller registration form

 Here the seller registers before inserting a product into the online shop.
 The seller needs to give all details, such as seller name, company name, user id, and mobile number, in the registration form.
 After registration is completed, the seller's products can be inserted into the online shop.
 Seller registration is successful only if the admin authorizes it.
8.3 User Registration Page

Fig 8.3 User registration form

 This is the registration page, where the user should register first.
 After registration is completed, the user can view all product details such as cost, offers, etc.
 If the user finds a good deal on any product, he can purchase the product directly.
8.4 Seller Login Page

Fig 8.4 Seller login page

 Here the seller should first log in; after that he can insert a product into the web page.
 The seller needs to give details like seller id and password and then press the submit button.
 When the seller is logged into the page, product fields and options appear, such as product warranty days, offer, phone number, and description.
8.5 User Login Page

Fig 8.5 User login page

 This is the user login page, where the user needs to give a user name and password to log in.
 Once the user logs in, he can view all product details.
 If he is interested in purchasing any product, he can purchase it directly.
8.6 Admin Login Page

Fig 8.6 Admin login page

 This is the admin login page.
 The admin is the authorized person for controlling fraud detection.
 The admin checks each seller's registration details, and only if the seller's details are found to be correct does the admin authorize the registration request.
 If the seller's details are not correct, access is denied.

9. CONCLUSION
We build online models for the auction fraud moderation and detection system designed for a major Asian online auction website. Through empirical experiments on real-world online auction fraud detection data, we show that our proposed online probit model framework, which combines online feature selection, bounding coefficients from expert knowledge, and multiple instance learning, can significantly improve over baselines and the human-tuned model. Note that this online modeling framework can be easily extended to many other applications, such as web spam detection, content optimization, and so forth.

FUTURE ENHANCEMENT

Regarding future work, one direction is to include the adjustment of selection bias in the online model training process. This has been proven to be very effective for offline models in [38]. The main idea there is to assume that all unlabeled samples have a response equal to 0 with a very small weight. Since the unlabeled samples are obtained from an effective moderation system, it is reasonable to assume that with high probability they are non-fraud. Another direction for future work is to deploy the online models described in this paper to the real production system, and also to other applications.

10. BIBLIOGRAPHY
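The future-work idea of treating unlabeled samples as weakly weighted negatives can be sketched in a few lines. This is a hedged illustration only: logistic regression stands in for the paper's probit model, and the feature vectors, weights, and learning rate below are invented for the example.

```python
import math

# Sketch: online training where labeled cases get full weight and
# unlabeled (moderated, un-flagged) cases are assumed non-fraud
# with a very small weight. All values here are illustrative.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sgd_update(w, x, y, weight, lr=0.1):
    """One weighted online gradient step for logistic regression."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    g = weight * (p - y)                      # weighted gradient scale
    return [wi - lr * g * xi for wi, xi in zip(w, x)]

w = [0.0, 0.0]
stream = [
    ([1.0, 2.0], 1, 1.0),    # labeled fraud: full weight
    ([1.0, -1.0], 0, 1.0),   # labeled non-fraud: full weight
    ([1.0, -0.5], 0, 0.05),  # unlabeled: assumed non-fraud, tiny weight
]
for x, y, weight in stream:
    w = sgd_update(w, x, y, weight)
```

The small weight (0.05 here) keeps the occasional mislabeled fraud case among the unlabeled data from dominating the model, while still letting the large unlabeled pool counteract selection bias.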
1. D. Agarwal, B. Chen, and P. Elango. Spatio-temporal models for estimating click-through rate. In Proceedings of the 18th international conference on World Wide Web, pages 21–30. ACM, 2009.
2. S. Andrews, I. Tsochantaridis, and T. Hofmann. Support vector machines for multiple-instance learning. Advances in Neural Information Processing Systems, pages 577–584, 2003.
3. C. Bliss. The calculation of the dosage-mortality curve. Annals of Applied Biology, 22(1):134–167, 1935.
4. A. Borodin and R. El-Yaniv. Online computation and competitive analysis, volume 53. Cambridge University Press, New York, 1998.
5. L. Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.
6. R. Brent. Algorithms for minimization without derivatives. Dover Publications, 2002.
7. D. Chau and C. Faloutsos. Fraud detection in electronic auction. In European Web Mining Forum (EWMF 2005), page 87.
8. H. Chipman, E. George, and R. McCulloch. BART: Bayesian additive regression trees. The Annals of Applied Statistics, 4(1):266–298, 2010.
9. W. Chu, M. Zinkevich, L. Li, A. Thomas, and B. Tseng. Unbiased online active learning in data streams. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge Discovery and Data Mining, pages 195–203. ACM, 2011.
10. C. Chua and J. Wareham. Fighting internet auction fraud: An assessment and proposal. Computer, 37(10):31–37, 2004.
11. R. Collins, Y. Liu, and M. Leordeanu. Online selection of discriminative tracking features. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 1631–1643, 2005.
12. N. Cristianini and J. Shawe-Taylor. An introduction to Support Vector Machines: and other kernel-based learning methods. Cambridge University Press, 2006.
13. T. Dietterich, R. Lathrop, and T. Lozano-Pérez. Solving the multiple instance problem with axis-parallel rectangles. Artificial Intelligence, 89(1-2):31–71, 1997.
14. Federal Trade Commission. Internet auctions: A guide for buyers and sellers, 2004.
15. J. Friedman. Stochastic gradient boosting. Computational Statistics & Data Analysis, 38(4):367–378, 2002.

SITES REFERRED:
1.
2.
3.
4.
5.