
FWD50 2018 - Key Characteristics to Digital Transformation


Natalie Evans Harris (BrightHive, Inc)

Published in: Government & Nonprofit

  1. 1. Natalie Evans Harris Co-Founder and COO, BrightHive Co-Ideator, Community-driven Principles for Ethical Data Sharing (CPEDS) Former Senior Policy Advisor to US Chief Technology Officer, Obama Administration Key Characteristics to Digital Transformation
  2. 2. Former Senior Policy Advisor to US Chief Technology Officer, Obama Administration I believe that, through our collective power, data can transform the human experience Co-Founder and COO, BrightHive Co-Ideator, Global Data Ethics Project (GDEP) Who Am I? @QuietStormnat LinkedIn: nevansharris Tea/Wine: Technology People Policy
  3. 3. Why Digitize Services - The World is Changing Transformed Services focus on: ★ The Journey: An end-to-end digital experience developed from the customer’s point of view, accessible anywhere, anytime, and from any device. ★ Protect Access: A unique, uniform digital ID that grants agencies access to the appropriate data and services. ★ Cross-Agency Collaboration: Mechanisms that allow agencies to share data across the state enterprise.
  4. 4. Transformation requires an examination of government: Policies - Privacy frameworks, governance models (e.g., GDPR, data sharing agreements) Processes - digital literacy, bureaucracy (e.g., training, ecosystems) Technology - Agile procurement, development, delivery (e.g., open technical standards) Digital state today produces 2.5 quintillion bytes of data daily (Forbes, May 2018)
  5. 5. Challenges to using Data for Social Good
  6. 6. Building Trust requires: Empowering People to speak on data use, establishing Processes to clarify practices, and leveraging Technology to hold us accountable Trust? Balancing Individual Rights with Social Impact “we must define the "common good" and begin again to shape a common future.” Barbara Charline Jordan, former U.S. Representative from Texas, 1976 Democratic National Convention Keynote Address
  7. 7. Global Data Ethics Project By Data Scientists, For Data Scientists: Increases responsiveness to the needs and concerns of data scientists Better captures the diverse spectrum of interests across the data science community May facilitate adoption of the code of ethics **Formerly Community-driven Principles for Ethical Data Sharing (CPEDS)
  8. 8. As a community, what’s important is… Informed and purposeful consent Guarantee the security of data, subjects, and algorithms Protect anonymous data subjects Transparency as the default Foster diversity Acknowledge and mitigate unfair bias Clearly established provenance Respect relevant tensions Communicate responsibly and accessibly Exercise Our Ethical Imagination!!
  9. 9. My Goal for this Workshop You walk away with new information that empowers you to take specific actions that improve your organization’s ability to digitally improve social outcomes
  10. 10. Layers to Digital Transformation Ecosystems People (skills, culture) Technology (infrastructure, solutions)
  11. 11. Characteristic 1: Technology Current Infrastructure for Identifying, Improving, and Investing in Impact Social Service Data Data-Driven Action Lack of Internal Analytics Capacity Lack of Standard Metrics and Algorithms Data Silos Within and Across Verticals
  12. 12. Digital State needs an Infrastructure that is Open and Collaborative Social Service Data Social Impact Provide Intuitive Views to Outcomes and Support Existing BI Tools Provide Standard Outcomes, KPIs, and Advanced Analytics Provide Vendor-neutral Platform for Secure Data Access and Integration
  13. 13. Civic Data Infrastructure Collaborated with AISP, MetroLab and Casey Foundation to answer this question: What are the key considerations in building and sustaining civic data infrastructures and the various technology approaches that may be helpful in overcoming challenges in data integration? A digital version of this report with resource web links can be viewed and downloaded as a PDF here:
  14. 14. Pain Points Lack of Agility ● Dependence on vendor to run inquiries - Procurement processes should require data be interoperable and accessible ● Support linking new streams Lack of Standards ● Data descriptions grow scarcer further up the data lifecycle Lack of Government Alignment ● Shared procurement to reduce costs and increase interoperability ● Improved governance to streamline legal and political challenges ● Reduce duplication of efforts
  15. 15. Technology for Civic Data Integration The purpose of this report is to describe key considerations in building and sustaining IDS and the various technology approaches that may be helpful in overcoming challenges in data integration. Consideration 0: Staffing Expertise Consideration 1: Data Management Consideration 2: Security and Privacy Consideration 3: Data Collection Consideration 4: Data Storage Consideration 5: Data Linking Consideration 6: Data Access and Dissemination
  16. 16. Characteristic 2: People Roger D. Peng and Elizabeth Matsui, The Art of Data Science Building public sector capacity to turn data into valuable insights that enable informed action
  17. 17. Data literacy is the ability to read, create and communicate data as meaningful information Data literacy program: ● must be agile and adaptive ● must view data literacy as a continuum ● must empower people
  18. 18. Critical Skills
  19. 19. The critical skills for the big data future look like the skills of the little data past *Ability to connect people and organizations *Ability to build and lead cross-sector coalitions *Ability to communicate clearly and sell the big vision *P-values: Patience, Persistence, Perseverance *Seeing the opportunities through the risks *Openness benefits all
  20. 20. *The Mythical Data Scientist
  21. 21. Ethical Culture Self sovereignty and informed consent: Empowers individuals to control their own data and determine its uses. Cooperation: Promotes collaboration between people and institutions Transparency & Openness: The origins and ownership are clear and workings are intelligible to non experts; information defaults to being open and free. Decentralization: Ownership, production, and control are distributed and driven by a community; default to open source. Flexibility: Easy for users to modify, adapt, improve, or inspect its core; Individuals and institutions may freely choose to use it or give it up Redundancy: More than one solution to every data and technology problem. No monopolies or “one platform to rule them all” Efficiency: Minimizes new resource requirements and personnel costs to realize impacts
  22. 22. Ethical Practices - Data Science Oath I recognize that data science has material consequences for individuals and society, so no matter what project or role I pursue, I will use my skills for their well-being. I will consider the privacy, dignity and fair treatment of individuals when selecting the data permissible in a given application, putting those considerations above the model’s performance. I have a responsibility to bring data transparency, accuracy and access to consumers, including making them aware of how their personal data is being used. I will act deliberately to ensure the security of data and promote clear processes and accountability for security in my organization. I will invest my time and promote the use of resources in my organization to monitor and test data models for any unintended social harm that the modeling may cause.
  23. 23. Ethical Design Checklist ❏ Have we listed how this technology can be attacked or abused? ❏ Have we tested our training data to ensure that it is fair and representative? ❏ Have we studied and understood possible sources of bias in our data? ❏ Does our team reflect diversity of opinions, backgrounds, and kinds of thought? ❏ What kind of user consent do we need to collect and use the data? ❏ Do we have a mechanism for gathering consent from users? ❏ Have we explained clearly what users are consenting to? ❏ Do we have a mechanism for redress, if people are harmed by the results? ❏ Can we shut this software down in production if it is behaving badly? ❏ Have we tested for fairness with respect to different user groups? ❏ Have we tested for disparate error rates among different user groups? ❏ Do we test and monitor for model drift to ensure our software remains fair over time? ❏ Do we have a plan to protect and secure user data?
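
One checklist item above asks whether disparate error rates have been tested across user groups. A minimal sketch of such a check follows, assuming each record is a hypothetical `(group, true_label, predicted_label)` tuple. The function names and the toy data are illustrative only and do not come from the talk.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Return {group: share of records where the prediction differs from the label}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, label, prediction in records:
        totals[group] += 1
        if label != prediction:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


def max_error_gap(rates):
    """Largest difference in error rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: (group, true label, predicted label)
sample = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 0), ("A", 0, 0),
    ("B", 1, 0), ("B", 0, 1), ("B", 1, 1), ("B", 0, 0),
]

rates = error_rates_by_group(sample)
print(rates)                 # per-group error rates
print(max_error_gap(rates))  # a large gap is a prompt for closer review
```

What counts as an acceptable gap is a policy decision for the team, not something the code can settle.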
  24. 24. A network of 6 service providers in Richmond, VA wanted to know who their overlapping populations were in order to better coordinate preventative care for superutilizing individuals. Ethical Challenges: 1. Guarantee that the sharing of information across service providers is limited to the minimum necessary 2. Ensure algorithms predicting risks for high need population are transparent and unbiased Identifying Overlapping Services and Coordinating Service Provision Across Networks of Service Providers Understands the societal outcomes and ethical implications
  25. 25. The biggest barrier to scaling new and productive uses of data across institutions came down to 5 words: Trust Power (Dis)Incentives Accountability Relevance
  26. 26. GOVERNMENT VOLUNTEER DATA COMMUNITY Nonprofits & Community Organizations FOUNDATIONS Industry UNIVERSITIES Collaboratively Develop Data Standards & Interoperable Tools Provide Technical Assistance and Change Management Services Define Shared Needs Assessment & Use Case Discovery Characteristic 3: Ecosystem for Social Impact
  27. 27. People with enough experience with the workings of a community to understand its technology needs, and enough experience with technology to take leadership in addressing those needs.
  28. 28. Ecosystem Data Maturity Venn Diagram Visionary Leadership Service Delivery Staff Technologists Policy Wonks
  29. 29. Principle 1: Focus on High-Impact Stakeholder Use Cases Principle 2: Promote Web 3.0 Convergence Principle 3: Foster Open Collaboration Principle 4: Develop Open Technical Standards and Protocols Principle 5: Utilize Open Competency Frameworks, Taxonomies, and Ontologies Principle 6: Empower Individuals and Enable Self-Sovereign Identity and Data Management Principle 7: Facilitate Open Data Access in Public-Private Data Infrastructure Principle 8: Promote Ethical Practices as well as Equity Considerations T3 Network Guiding Principles
  30. 30. 1. Promoting and gaining acceptance of the T3 Guiding Principles 2. Exploring and developing a public-private data and technology infrastructure that includes: ○ Public-private data standards (WG2) ○ An open and distributed competency data and technology infrastructure (WG3) ○ New architectures and uses of individual-linked data (WG4) 3. Developing and promoting participation in high-impact projects that address the most critical stakeholder use cases (WG1) 4. Convening stakeholders to review progress, share information, and develop new initiatives Future of the T3 Innovation Network
  31. 31. Work Groups Work Group 1: Stakeholder Use Cases for Achieving Breakthrough Innovations Work Group 2: Exploring Sustainable Data Standards Convergence Work Group 3: Developing and Analyzing Competencies Work Group 4: New Architectures and Uses of Linked Individual-Level Data T3 Innovation Network
  32. 32. “We are a people in a quandary about the present. We are a people in search of our future. We are a people in search of a national community. We are a people trying not only to solve the problems of the present, unemployment, inflation, but we are attempting on a larger scale to fulfill the promise of America. We are attempting to fulfill our national purpose, to create and sustain a society in which all of us are equal.” Barbara Charline Jordan, former U.S. Representative from Texas, 1976 Democratic National Convention Keynote Address Individuals Are The Power:
  33. 33. Thank you for your time!! @QuietStormnat LinkedIn: nevansharris Tea/Wine:
  34. 34. Backup Slides
  35. 35. “There is no executive order; there is no law that can require the American people to form a national community. This we must do as individuals, and if we do it as individuals, there is no President of the United States who can veto that decision… we must define the "common good" and begin again to shape a common future.” Barbara Charline Jordan, former U.S. Representative from Texas, 1976 Democratic National Convention Keynote Address
  36. 36. Civic Data Infrastructure Today’s Landscape ● People are the most important part of infrastructure (users, IT support, data scientists, etc.) ● Industry is moving away from monolithic to modular enterprise systems Key Technical Factors - Security/privacy - Identity and Reconciliation - Transparency/Interoperability - Mapping data use to methodology
  37. 37. A Look into the Crystal Ball In Five Years: Staff skills should move up in the value chain, especially as the technologies evolve. Technology will facilitate people to provide higher value deliverables; how we integrate data will no longer be the question, but rather how we ensure that the analysis and models are affecting practice and driving change. Infrastructure should provide more than just integration support, it should provide transparency and documentation about how solutions were developed, including confidence levels for results. The field has come a long way in the past five years, particularly around transparency and consistency. In order to conduct multi-site inquiry and continue to improve data models, all sites must emphasize the development of metadata; not only as a data operation but to contribute to the field. The data science “gap” will diminish as schools incorporate analytic methods and skills into more undergraduate and professional graduate programs. An ongoing bottleneck will be legal and transactional issues such as privacy, domain use, ethics, international considerations, and inclusive engagement. Experts and practitioners will be thinking about blockchain, consent management and identity management in how we collect data on people. We will need to be mindful of artificial intelligence and machine learning to ensure that we are careful in training data sets while focusing on ethical use and data governance practices.
  38. 38. Consideration 0: Staffing Expertise Staff with substantive business expertise regarding the issues, programs, and context-specific social service areas are critical. Infrastructure and technology choices must be tightly integrated with the underlying work culture and business processes. Four key skill areas: Data Storage and System Administration: solution architects, security administrators, database administrators, etc. Data Integration: data / information architects, data modelers, data engineers, integration/interface engineers, etc. Data Analytics: research / evaluation specialists, data analysts, data scientists, business intelligence analysts, etc. Data Publication: website administrators, UX designers, BI portal administrators. Key Questions: Are the technology solutions chosen on-premise, cloud-based, or a combination? Are existing staff able to support both on-premise and cloud-based systems, or are new staff needed?
  39. 39. Consideration 1: Data Management Evaluating data management needs requires: - Close examination of how to drive standardization upstream and downstream - Planning for an agile and flexible data intake and normalization or indexing process Reduces: - dependence on specific vendors - ongoing maintenance and upgrade costs Supports ability to link new data streams as they become relevant Key Questions: Does the solution support monitoring of data collection, cleaning, and integration activities? How does data quality feedback get passed to the data generators, and how does one track reporting deadlines and data completeness?
  40. 40. Consideration 2: Security and Privacy “How will the IDS protect the data?” - Individual privacy must be legally, technically, procedurally, and physically maintained throughout the process. - Technology should leverage anonymization techniques such as data masking, data aggregation, and data obfuscation. - A process is needed for ensuring proper consent to use administrative data. - Strong encryption algorithms (e.g., AES, RSA, and ECC) should be deployed, and the system administrator should be able to manage the encryption keys that keep the data private. Key Questions: How does the solution manage user accounts to ensure authorized access to the correct data, at the right times, for the right reasons? Is there a single sign-on solution? Are there role-based access controls (RBAC)?
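
Two of the anonymization techniques named above, data masking and data aggregation, can be sketched roughly as follows. This is an illustration under stated assumptions, not a method from the report: the function names, the `zip` field, and the `min_cell_size` threshold of 5 are all hypothetical, and the identifier shown is deliberately fake.

```python
from collections import Counter


def mask_ssn(ssn):
    """Data masking: hide all but the last four digits of an identifier."""
    digits = [ch for ch in ssn if ch.isdigit()]
    return "***-**-" + "".join(digits[-4:])


def aggregate_counts(records, key, min_cell_size=5):
    """Data aggregation: report counts per category, suppressing small cells.

    Cells with fewer than `min_cell_size` people are returned as None so
    that rare categories cannot be used to single out individuals.
    """
    counts = Counter(record[key] for record in records)
    return {
        category: (n if n >= min_cell_size else None)
        for category, n in counts.items()
    }


# Hypothetical records; only the field used for grouping is shown.
people = [{"zip": "23220"}] * 5 + [{"zip": "23221"}] * 2

print(mask_ssn("000-00-1234"))
print(aggregate_counts(people, "zip"))
```

Masking and aggregation reduce exposure but do not by themselves guarantee anonymity, so they would normally sit alongside the consent, encryption, and access controls listed on the slide.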
  41. 41. Consideration 3: Data Collection An IDS must be dynamic and agile enough to support new requirements, changing schemas, and new data streams. Ways to upload data: - Manual - an individual uploads the data - Automatic - a connection between source and integration system - Hybrid - some manual uploads and some automated connections Key Question: What is the cost of establishing the connection and implementing technology necessary to automatically collect and clean the data?
  42. 42. Consideration 4: Data Storage How will the data be stored - on-site, in a data center, or in one of the many cloud-based solutions? Where will the data be stored - A warehousing solution is needed to properly store and make data accessible - Repository - The simplest warehousing options store the source data - Data lake - Due to the increasing complexity of data sources, many organizations have moved to data lakes, which are a system of repositories. Key Questions: If using a cloud storage provider, is it up-to-date on data center and industry certifications such as HIPAA, FedRAMP, PCI DSS, SSAE 16? What happens if there is a data breach?
  43. 43. Consideration 5: Data Linking Data linking is the process of integrating different data sources, based upon common business keys and other identifying information. There are two key methods for data linkage: - Deterministic matching - considered more precise since it looks for exact matches in the content and format of datasets (e.g., identical SSN) - Probabilistic matching (also known as fuzzy matching) - looks for closeness in the data (e.g., identical SSN with or without dashes) and provides weighted scores for likelihood of matching. Key Question: What happens when a new data source is introduced and the matching rules need to change?
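
The two linkage methods described above can be contrasted in a short sketch. It is a simplified illustration, not the approach used in the report: the record fields, the weights, and the function names are assumptions, and the score is a plain weighted sum rather than a full probabilistic linkage model. Name similarity uses `difflib.SequenceMatcher` from the Python standard library.

```python
from difflib import SequenceMatcher


def deterministic_match(a, b):
    """Exact match on content and format of the SSN field."""
    return a["ssn"] == b["ssn"]


def _digits(value):
    """Strip formatting so '000-00-1234' and '000001234' compare equal."""
    return "".join(ch for ch in value if ch.isdigit())


def probabilistic_score(a, b, weights=None):
    """Weighted closeness score in [0, 1]; the weights are illustrative assumptions."""
    weights = weights or {"ssn": 0.6, "name": 0.3, "dob": 0.1}
    ssn_sim = 1.0 if _digits(a["ssn"]) == _digits(b["ssn"]) else 0.0
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    dob_sim = 1.0 if a["dob"] == b["dob"] else 0.0
    return (weights["ssn"] * ssn_sim
            + weights["name"] * name_sim
            + weights["dob"] * dob_sim)


# Hypothetical records with deliberately fake identifiers.
rec_a = {"ssn": "000-00-1234", "name": "Jordan Smith", "dob": "1990-01-01"}
rec_b = {"ssn": "000001234", "name": "Jordon Smith", "dob": "1990-01-01"}
rec_c = {"ssn": "000-00-9999", "name": "Alex Lee", "dob": "1985-05-05"}

print(deterministic_match(rec_a, rec_b))  # formats differ, so no exact match
print(probabilistic_score(rec_a, rec_b))  # high score despite formatting and spelling
print(probabilistic_score(rec_a, rec_c))  # low score for a different person
```

Keeping the weights in a dictionary, rather than hard-coding them, is one way to address the key question above: when a new data source arrives, the matching rules can change without rewriting the scoring function.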
  44. 44. Consideration 6: Data Access and Dissemination Secure data access for analysts includes VPN remote access, limited and licensed data sets, and on-site access. Data dissemination is the finished product and comes in many forms, from reports, to machine-readable datasets, to dashboards and websites. - how the technology manages access to the finished product, - how it will be disseminated, and - how risks of redisclosure are minimized. Key Question: Who gets access to the data infrastructure, and by what means? Are there levels of access for identified and de-identified information, or for internal and external users?
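
The key question about levels of access for identified and de-identified information could be approached along these lines. This is a hypothetical sketch: the role names, the `ROLE_PERMISSIONS` table, and the list of identifying fields are assumptions made for illustration, and dropping direct identifiers is only a first step toward de-identification.

```python
# Fields treated as direct identifiers in this sketch (an assumption).
IDENTIFYING_FIELDS = {"name", "ssn", "dob"}

# Hypothetical mapping of roles to the access levels they may use.
ROLE_PERMISSIONS = {
    "internal_analyst": {"identified", "deidentified"},
    "external_researcher": {"deidentified"},
}


def view_for_role(record, role):
    """Return the version of a record that the given role is allowed to see."""
    allowed = ROLE_PERMISSIONS.get(role, set())
    if "identified" in allowed:
        return dict(record)
    if "deidentified" in allowed:
        return {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    raise PermissionError(f"role {role!r} has no access to this data")


# Hypothetical record with a deliberately fake identifier.
record = {
    "name": "Jordan Smith",
    "ssn": "000-00-1234",
    "dob": "1990-01-01",
    "program": "job training",
    "zip": "23220",
}

print(view_for_role(record, "internal_analyst"))
print(view_for_role(record, "external_researcher"))
```

Unknown roles are denied by default, which matches the slide's emphasis on granting access only for the right reasons.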
  45. 45. How are summer job opportunities distributed in my city? Describe.
  46. 46. Which students are at risk of dropping out early? Predict.
  47. 47. What occupations are changing most rapidly in my area? Detect.
  48. 48. How are my training programs affecting future wages and employment? Evaluate.