
Introduction of Trustie Software Repository & Passion-Lab Data Center, OW2con'12, Paris



The large body of existing OSS artifacts provides abundant material for understanding how code is reused across the open source universe: in particular, which pieces of code are reused most, under what circumstances people reuse code, and so forth. Understanding this process could help with legacy software maintenance and with identifying best practices in software development. Using the change history of thousands of open source projects, we try to answer the following questions. First, how is code reused by other projects? Second, how are code files organized within a project, and how does this organization change over time? Answering these questions requires overcoming several technical difficulties. For example, because there are different kinds of version control systems (VCSs), it is hard to devise a uniform model that represents the evolution of the code files stored in them. Each VCS may also have its own data format, so extracting data from them is a big challenge. Furthermore, analyzing the version history and reuse information of about a billion code files with current algorithms and hardware platforms is another challenge.
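To make the idea of a uniform model concrete, here is a minimal sketch. It is not the authors' model: the `FileRevision` record, its field names, and the use of a content hash to spot identical files shared between projects are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRevision:
    """Hypothetical VCS-neutral record of one version of one file."""
    project: str       # project the file belongs to
    vcs: str           # source system, e.g. "git" or "svn"
    path: str          # file path inside the project
    revision: str      # revision identifier in the source VCS
    timestamp: int     # commit time, seconds since the epoch
    content_hash: str  # hash of the file content at this revision


def hash_content(data: bytes) -> str:
    """Return a hex digest that identifies identical file contents."""
    return hashlib.sha1(data).hexdigest()


def find_reused(revisions):
    """Map each content hash seen in more than one project to those projects."""
    projects_by_hash = defaultdict(set)
    for rev in revisions:
        projects_by_hash[rev.content_hash].add(rev.project)
    return {
        digest: sorted(projects)
        for digest, projects in projects_by_hash.items()
        if len(projects) > 1
    }
```

Exact hashing only detects verbatim copies; detecting modified copies would need a different similarity measure.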

Published in: Technology


  1. Introduction of Trustie Software Repository & Passion-Lab Data Center
     Meng Li & Minghui Zhou, Peking University, Beijing, China
     Software Institute, School of Electronics Engineering and Computer Science, Peking University
     Key Laboratory of High Confidence Software Technologies, Ministry of Education
     12/4/12
  2. Agenda
     • Introduction of TSR
     • Trustworthiness Evaluation in TSR
     • Passion-Lab Data Center
  3. Introduction of TSR
     • Background
       – Software reuse aims to improve software development by reusing software resources (SRs).
       – Typical SRs: components, Web services, tools, architectures, etc.
       – A software resource repository is the infrastructure that provides SR management mechanisms to support software reuse.
         – Such as publishing, retrieving, classification, storage, feedback, evaluation, etc.
     [Figure: Software Resource Repository holding code, APIs, JARs, and Web services]
  4. Introduction of TSR
     • About TSR
       – The Trustie Software Resource Repository (TSR) provides mechanisms to describe, collect, evaluate, classify, and manage the trustworthiness of SRs, in order to support trustworthy software development.
       – TSR is part of the Trustie Project (a major 863 project, China).
       – Trustie Project:
       – TSR:
       – TSR is developed by Peking University.
  5. Introduction of TSR
     • Functionalities of TSR
       – Resource management
         – Publishing/editing/downloading/deleting resources
         – Retrieving resources
         – Recommending resources
       – User management
         – Registration, sign in/out, etc.
       – Feedback
         – Rating or comment
         – Quality template-based feedback
       – Tag management
       – Trustworthiness evaluation
       – Statistics
  6. Introduction of TSR
     • TSR Home Page
  7. Introduction of TSR
     • Architecture of TSR
  8. Introduction of TSR
     • Application of TSR
       – TSR has been deployed in several software incubators and companies throughout China.
       – Installation for OW2 in Grenoble, France (in Nov. 2012).
  9. Agenda
     • Introduction of TSR
     • Trustworthiness Evaluation in TSR
     • Passion-Lab Data Center
  10. Trustworthiness Evaluation in TSR
      • Trustworthiness Evaluation in TSR
  11. Trustworthiness Evaluation in TSR
      • Trustworthiness Evaluation Methods
        – Feedback-Based Trustworthiness Evaluation (FBTE)
        – Internet-Based Trustworthiness Evaluation (IBTE)
  12. Feedback-Based Trustworthiness Evaluation (FBTE)
      • Rationale behind FBTE:
        – When I want to find and reuse a trustworthy SR, the most straightforward way is to choose SRs that other users have identified as trustworthy.
        – In other words, users’ feedback is an important kind of trustworthiness evidence.
        – Moreover, if the user is trustworthy (e.g. an expert), their feedback is more likely to be trustworthy.
      • We collect users’ feedback in TSR and provide FBTE to help users find and reuse SRs.
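One way to picture the rationale that feedback from trustworthy users should count for more is a weighted average of ratings. This is only an illustrative sketch: the slides do not give a formula, and the function name `weighted_rating` and the per-user trust weights are assumptions.

```python
from typing import Sequence, Tuple


def weighted_rating(feedback: Sequence[Tuple[float, float]]) -> float:
    """Average ratings, weighting each by the (assumed) trust in its author.

    Each entry is (rating, trust_weight); a higher weight means the
    author's feedback counts for more, e.g. an expert versus a new user.
    """
    total_weight = sum(weight for _, weight in feedback)
    if total_weight <= 0:
        raise ValueError("at least one feedback entry needs a positive weight")
    return sum(rating * weight for rating, weight in feedback) / total_weight
```

For example, an expert rating of 5 with weight 2.0 and a regular rating of 2 with weight 1.0 average to 4.0 rather than the unweighted 3.5.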
  13. Feedback-Based Trustworthiness Evaluation (FBTE)
      • FBTE is based on the feedback functionality in TSR
        – Simple feedback: rating & comment
        – Template-based feedback: detailed feedback (stability, security, portability, etc.)
  14. Feedback-Based Trustworthiness Evaluation (FBTE)
      • FBTE procedure:
        a) The TSR admin assigns trustworthiness evaluation experts;
        b) Experts evaluate the trustworthiness of an SR by providing template-based feedback;
        c) The TSR admin confirms the feedback, and the latest overall rating is taken as the SR’s trustworthiness level.
      • FBTE is simple, yet widely used in software repositories/portals.
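The three-step procedure above can be sketched as a small workflow. The class and method names are hypothetical and do not come from TSR. "Latest" is read here as the most recently submitted feedback that the admin has confirmed, which is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Feedback:
    """Template-based feedback from one expert (hypothetical structure)."""
    expert: str
    scores: Dict[str, float]  # e.g. {"stability": 4, "security": 3}
    overall: float            # overall rating given by the expert
    confirmed: bool = False   # set by the admin in step (c)


@dataclass
class ResourceEvaluation:
    """FBTE state for a single software resource."""
    experts: Set[str] = field(default_factory=set)
    feedback: List[Feedback] = field(default_factory=list)

    def assign_expert(self, name: str) -> None:
        """Step (a): the admin assigns an evaluation expert."""
        self.experts.add(name)

    def submit(self, fb: Feedback) -> None:
        """Step (b): an assigned expert provides template-based feedback."""
        if fb.expert not in self.experts:
            raise PermissionError(f"{fb.expert} is not an assigned expert")
        self.feedback.append(fb)

    def confirm(self, fb: Feedback) -> None:
        """Step (c): the admin confirms a piece of feedback."""
        fb.confirmed = True

    def trust_level(self) -> Optional[float]:
        """Return the overall rating of the latest confirmed feedback."""
        for fb in reversed(self.feedback):
            if fb.confirmed:
                return fb.overall
        return None
```

Until the admin confirms at least one piece of feedback, `trust_level()` returns `None`, so unreviewed ratings never become the published level.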
  15. Internet-Based Trustworthiness Evaluation (IBTE)
      • Background
        – SRs are diversifying
          – From closed, static, code
          – To open, dynamic, service
        – With the development of Web 2.0, there is more and more information on the Internet
          – Feedback on different websites
            • E.g. comments and ratings on Seekda, Service-finder
      • We try to collect trustworthiness evidence from the Internet and use it to evaluate the trustworthiness of SRs in TSR.
  16. Internet-Based Trustworthiness Evaluation (IBTE)
      • Framework Overview
      [Figure: framework of the trustworthiness evaluation system. Labels recoverable from the diagram: SR list, collecting SR information, TSR, related information from the Internet, management of related information, method of information collection, trustworthy evidence extraction, method of evidence extraction, items of evidence, trustworthiness evaluation model, method of trustworthiness evaluation, trustworthiness level, trustworthiness evidence storage]
  17. Internet-Based Trustworthiness Evaluation (IBTE)
      • Taking Web services as an example.
  18. Internet-Based Trustworthiness Evaluation (IBTE)
      • Collecting trustworthiness evidence from the Internet
        – Objective evidence (QoS)
          – Implement a QoS monitor that invokes Web services on a regular basis.
          – Over 2 million records
        – Subjective evidence (Reputation)
          – Calculated from users’ feedback, including ratings and comments
          – For comments, we first apply sentiment analysis

        No.  Comment                                                            Result
        1    It works perfectly.                                                Positive (+1)
        2    Cool service!                                                      Positive (+1)
        3    No data returned                                                   Negative (-1)
        4    Gave a wrong result ("false") on a valid email address. Useless.   Negative (-1)
        5    It’s not working.                                                  Negative (-1)
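As a rough illustration of how the +1/-1 sentiment results could feed a reputation value, the sketch below averages the polarities from the table above. The slides do not state the aggregation formula, so the mean used here and the name `reputation_from_polarities` are assumptions. The sentiment analysis step itself is not implemented; the labels are taken directly from the table.

```python
from typing import Sequence

# Polarity labels copied from the five example comments in the table.
EXAMPLE_POLARITIES = [+1, +1, -1, -1, -1]


def reputation_from_polarities(polarities: Sequence[int]) -> float:
    """Average +1/-1 comment polarities into a score in [-1, 1].

    Assumed aggregation: a plain mean, so -1 means all comments are
    negative, +1 means all are positive, and 0 means an even split.
    """
    if not polarities:
        raise ValueError("no comments to aggregate")
    if any(p not in (-1, 1) for p in polarities):
        raise ValueError("polarities must be +1 or -1")
    return sum(polarities) / len(polarities)


example_reputation = reputation_from_polarities(EXAMPLE_POLARITIES)
```

With two positive and three negative comments, the example service ends up slightly below neutral.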
  19. Internet-Based Trustworthiness Evaluation (IBTE)
      • Demo
  20. Agenda
      • Introduction of TSR
      • Trustworthiness Evaluation in TSR
      • Passion-Lab Data Center
  21. What are the best practices in OSS?
      • Research questions:
        – How do people reuse code over time?
        – Which code is reused most often?
        – What attributes does it have?
      • Build infrastructure to review code reuse across the OSS universe
  22. Infrastructure of Big Data
  23. Infrastructure
      • We keep tracking various commercial and open source projects.
      • This “universal” repository records data from:
        – Version control
        – Issue tracking
        – Email archives
        – ……
  24. Hardware and data levels
      • Machines
        – DELL R910 (4U), 64 GB RAM, 16 cores (X7550)
        – DELL MD3200, 12 × 2 TB SAS
        – DELL R710 × 4, 64 GB RAM
      • Data levels
        – Level 0: raw data
        – Level 1: filtered data
        – Level 2 to n: customized data
  25. What could we do with this data?
      • To enable a better ecosystem, user experience, and practices ……
        – Understand the past, predict the future
      HTTP://
      A cloud for OSS best practices based on issue tracking data, version control data and …
  26. Related Websites
      • Trustie Software Repository
      • CoWS Web Services Search Engine
      • Trustworthiness Evaluation
      • API Example
      • Passion-Lab
      • TSR @ OW2
      All the systems are Open Source
  27. Thank You! Merci! 谢谢!