This document discusses Predictive Coding 2.0 as an improved method for e-discovery. Predictive Coding 1.0 had limitations in dealing with document collections that are incomplete and continuously updated, and with coding calls that change over time. Predictive Coding 2.0 uses a flexible analytics framework based on bipartite graphs that can dynamically assess documents and adapt to new information as the collection grows. This allows for continuous case assessment rather than analysis only once the collection is complete. The document provides examples of how Predictive Coding 2.0 could be applied to different litigation scenarios.
MAIS Onlus - Proceedings of the 2013 Conference "Solidarity in Times of Crisis" - MAIS Onlus
On 20 and 21 April, the annual conference of the volunteer association MAIS Onlus (Movimento per l'Autosviluppo Internazionale nella Solidarietà) was held in Rome.
"Solidarity in times of crisis": this year's conference title offered an opportunity to discuss a timely topic and to reflect on how to face the crisis at such a critical moment for non-profit organizations (NPOs).
"An opportunity for a turning point and a new starting point" is how Loredana Rabellino, coordinator of international projects, describes the crisis, stressing that "In Italy, the economic crisis has not been matched by a crisis of participation, which has in fact increased. We are called to rediscover the true nature of a volunteer association, welcoming and responding to the demand for ever greater direct participation, both in activities in Italy and in long-distance support projects around the world."
It is this broad participation of supporters that makes new fundraising initiatives possible, such as the "Visita l'arte" days: exhibitions, meetings, and walks in the name of solidarity, occasions that combine the rediscovery of Italy's cultural treasures with support for international solidarity.
"Organizationally," explains Pasquale Castaldo, a MAIS Onlus volunteer overseeing the reorganization of processes, "the decline in the number of supporters has been offset by greater efficiency in the tools we use and an optimization of financial resources."
The necessary cuts have therefore not significantly affected the quality of the association's responses, and the crisis has made it possible to experiment with new associative forms and other ways of supporting at a distance, such as class sponsorship projects in place of sponsoring individual students, which have been implemented successfully in Brazil and have proven an effective cost-containment tool.
In Madagascar, the organic farm born out of the decade-long long-distance support project has demonstrated its full potential to overcome conditions of poverty, successfully experimenting with organizing farmers into an agricultural cooperative.
The internal reorganization of the office and the reworking of the association's communication strategy are doing the rest, strengthening the ongoing dialogue with supporters, also through social networks and integrated communication, which allow an association that has worked for twenty-five years in long-distance support (SAD) in Italy to reach new profiles such as the online donor or the donor focused on achieving a single goal.
How much time does it take for my feature to arrive? - Daniel Alencar
How much time does it take for a bug fix or a new feature to become available to users? We did an empirical study to better understand what makes a new feature or bug fix reach users faster.
Technical debt is often characterized as design or code tradeoffs. In this talk I discuss how shortcuts in requirements analysis might lead to technical debt as well.
We'll explore why it is a risky bet not to *aim* to manage infrastructure and its configuration with idempotence and immutability at heart.
Sharing real-world experience, we'll see why configurations should not be done by humans (it's like playing Jenga), and why what may work at the beginning does not work over a long period of time or at scale (the pets vs. cattle problem).
Industry - Relating Developers' Concepts and Artefact Vocabulary in a Financial Software Module - ICSM 2011
Paper: Relating Developers' Concepts and Artefact Vocabulary in a Financial Software Module
Authors: Tezcan Dilshener and Michel Wermelinger
Session: Industry Track 2 - Reverse Engineering
Use SAS to identify which tell-tale signs in consumers’ credit histories best model bad consumers, and, in turn, use this to prevent potential future bad consumers from being approved for lines of credit.
AWS Public Sector Symposium 2014 Canberra | Putting the "Crowd" to work in the "Cloud" - Amazon Web Services
"Cloud" computing provides significant advantages and enormous cost savings by allowing IT infrastructure to be provisioned as a ubiquitous, metered, unit priced and on demand service. However, the other major resourcing issue faced by CIOs is the provision of skilled labour to develop, support and maintain an increasingly wide range of IT applications.
This session will show attendees how the worldwide pool of freelance developers, the "Crowd", can be utilised as a ubiquitous, metered, unit priced and on demand resource pool to work in the "Cloud" to improve responsiveness to customer demands, reduce development timeframes and achieve significant cost savings.
Although the crowd can bring enormous benefits in terms of cost and agility, there are some technical and business barriers to adoption in large organisations. This presentation will discuss the barriers and, using some real examples, will explain how GoSource overcomes them.
Build systems orchestrate how human-readable source code is translated into executable programs. In a software project, source code changes can induce changes in the build system (a.k.a. build co-changes). It is difficult for developers to identify when build co-changes are necessary due to the complexity of build systems. Prediction of build co-changes works well if there is a sufficient amount of training data to build a model. In practice, however, new projects have only a limited number of changes. Using training data from other projects to predict the build co-changes in a new project can help improve the performance of build co-change prediction. We refer to this problem as cross-project build co-change prediction.
In this paper, we propose CroBuild, a novel cross-project build co-change prediction approach that iteratively learns new classifiers. CroBuild constructs an ensemble of classifiers by iteratively building classifiers and assigning each a weight according to its prediction error rate. Given that only a small proportion of code changes are build co-changing, we also propose an imbalance-aware approach that, in each iteration, learns a threshold boundary between code changes that are build co-changing and those that are not. To examine the benefits of CroBuild, we perform experiments on 4 large datasets, Mozilla, Eclipse-core, Lucene, and Jazz, comprising a total of 50,884 changes. On average, across the 4 datasets, CroBuild achieves an F1-score of up to 0.408. We also compare CroBuild with other approaches: a basic model, AdaBoost proposed by Freund et al., and TrAdaBoost proposed by Dai et al. On average, across the 4 datasets, CroBuild yields an improvement in F1-score of 41.54%, 36.63%, and 36.97% over the basic model, AdaBoost, and TrAdaBoost, respectively.
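The abstract does not include the authors' implementation, so as a rough illustrative sketch only: an error-rate-weighted ensemble of the kind described (an AdaBoost-style loop over one-dimensional decision stumps, with the final decision threshold left adjustable for imbalanced classes) might look like this. All names, the stump learner, and the data shape are assumptions, not CroBuild itself.

```python
import math

def fit_stump(xs, ys, weights):
    """Pick the 1-D threshold and sign that minimize weighted error."""
    best = None
    for t in sorted(set(xs)):
        for sign in (1, -1):
            preds = [1 if sign * (x - t) > 0 else 0 for x in xs]
            err = sum(w for p, y, w in zip(preds, ys, weights) if p != y)
            if best is None or err < best[0]:
                best = (err, t, sign)
    return best  # (weighted error, threshold, sign)

def fit_ensemble(xs, ys, rounds=5):
    """Iteratively build classifiers, weighting each by its error rate."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, t, sign = fit_stump(xs, ys, weights)
        err = max(err, 1e-9)
        if err >= 0.5:
            break
        alpha = 0.5 * math.log((1 - err) / err)  # classifier weight from error rate
        ensemble.append((alpha, t, sign))
        # re-weight samples: boost the misclassified ones
        for i, (x, y) in enumerate(zip(xs, ys)):
            pred = 1 if sign * (x - t) > 0 else 0
            weights[i] *= math.exp(alpha if pred != y else -alpha)
        z = sum(weights)
        weights = [w / z for w in weights]
    return ensemble

def predict(ensemble, x, threshold=0.0):
    """`threshold` can be moved off 0 to favour the rare (co-changing) class."""
    score = sum(alpha * (1 if sign * (x - t) > 0 else -1)
                for alpha, t, sign in ensemble)
    return 1 if score > threshold else 0
```

The cross-project aspect (transferring weights learned on one project's changes to another, as TrAdaBoost does) is omitted here for brevity.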
Certification Study Group - Professional ML Engineer Session 3 (Machine Learn...) - gdgsurrey
Dive into the essentials of ML model development, processes, and techniques to combat underfitting and overfitting, explore distributed training approaches, and understand model explainability. Enhance your skills with practical insights from a seasoned expert.
ACCOUNTING INFORMATION SYSTEMS - Access and Data Analytics Test.docx - SALU18
ACCOUNTING INFORMATION SYSTEMS
Access and Data Analytics Test
General Instructions.
This exam has four parts. Part 1 is in class. Parts 2, 3, and 4 are take-home. Submit all parts to the designated dropbox folder. I expect your individual effort on all parts. Parts 2 to 4 are described in a separate document.
Part 1 – Access (50 points).
To get full credit, you must set up appropriate relationships among the tables and enforce referential integrity for each link. Your queries must produce the correct values, the fields must be labeled and formatted appropriately, and query designs must not include extraneous tables. In other words, you should follow the list of fundamental rules for Access posted on BeachBoard and included at the end of this document for reference.
1. Download the Fall_2019 database posted in the Access and Data Analytics Test Module under CONTENT on BeachBoard.
2. Ensure that primary keys are set and establish appropriate relationships among the tables: Stores, Vendors, Purchases, and Purchase_Items. Stores and Vendors should be linked to Purchases. Purchases should be linked to Purchase_Items.
3. Prepare the following queries, naming the queries qa, qb, qc, qd, corresponding to the identifying letters below:
a. Use the purchase_items table to calculate the dollar amount of each item purchased in an extension query; name your new calculated field purchase_item_amount and format it appropriately.
b. Use qa and the purchases table to sum the purchase item amounts for each purchase in an accumulation query; include all fields from the purchases table and the purchase_item_amount field from qa; name your summed field purchase amount and format it appropriately.
c. Use qb and the vendors table to sum the purchase amounts from each vendor in another accumulation query; include vendor number, name, city, and state; name your summed field vendor purchases and format it appropriately.
d. Use the qb query. Keeping all fields from qb, calculate the month of the purchase; name that field purchase month.
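These queries are meant to be built in the Access query designer, but for readers more comfortable with SQL, here is a rough sqlite-based sketch of queries a through d. The column names and sample rows are invented for illustration and will not match the Fall_2019 database; Access SQL also differs from sqlite in details (e.g., it uses Month() rather than strftime).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE purchases (purchase_no INTEGER PRIMARY KEY, vendor_no INTEGER,
                        purchase_date TEXT);
CREATE TABLE purchase_items (purchase_no INTEGER, quantity INTEGER,
                             unit_price REAL);
CREATE TABLE vendors (vendor_no INTEGER PRIMARY KEY, name TEXT,
                      city TEXT, state TEXT);
-- invented sample rows
INSERT INTO vendors VALUES (1, 'Acme', 'Long Beach', 'CA');
INSERT INTO purchases VALUES (100, 1, '2019-09-15'), (101, 1, '2019-10-02');
INSERT INTO purchase_items VALUES (100, 2, 5.00), (100, 1, 3.00), (101, 4, 2.50);
""")

# qa: extension query - dollar amount of each item purchased
qa = """SELECT purchase_no, quantity * unit_price AS purchase_item_amount
        FROM purchase_items"""

# qb: accumulation query - total purchase amount per purchase
qb = f"""SELECT p.purchase_no, p.purchase_date,
                SUM(qa.purchase_item_amount) AS purchase_amount
         FROM purchases p JOIN ({qa}) qa ON p.purchase_no = qa.purchase_no
         GROUP BY p.purchase_no"""

# qc: accumulation query - total purchases per vendor
qc = f"""SELECT v.vendor_no, v.name, v.city, v.state,
                SUM(qb.purchase_amount) AS vendor_purchases
         FROM vendors v
         JOIN purchases p ON p.vendor_no = v.vendor_no
         JOIN ({qb}) qb ON qb.purchase_no = p.purchase_no
         GROUP BY v.vendor_no"""

# qd: all fields from qb plus the month of the purchase
qd = f"""SELECT qb.*, strftime('%m', qb.purchase_date) AS purchase_month
         FROM ({qb}) qb"""

print(cur.execute(qc).fetchall())
```

Note how each query builds on the previous one, mirroring how Access lets a saved query serve as the data source for the next.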
BEFORE SUBMITTING, ask me to review your work. After I say that you are done, submit your file to the BeachBoard DROPBOX. Be sure to close Access before you upload your results.
Some Fundamental Rules for Access
1. Look at your tables and think about what information those tables provide before you start linking tables and creating queries.
2. Make sure each table has a primary key designated.
3. Always establish relationships between tables first, before starting queries.
4. Always enforce referential integrity (or understand why you can’t).
5. No “expr1” field names.
6. Do not click on the big sigma to produce totals if the query doesn’t require totals (i.e., an extension query).
7. Avoid “SumOf…” field names in accumulation queries.
8. Include identifying information in addition to the primary key in accumulation queries that provide subtotals.
9. Always format new fields properly.
Fine-tuning Large Language Models by Dmitry Balabka - DevClub_lv
This session focuses on the hands-on process of preparing datasets and fine-tuning models for a specific business task. It will cover dataset preparation, model fine-tuning, and cloud ML accelerators like TPUs and related libraries. It’s aimed at those seeking hands-on knowledge in applying ML techniques.
UiPath Test Automation using UiPath Test Suite series, part 4 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
GraphRAG is All You Need? LLM & Knowledge Graph - Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Epistemic Interaction - tuning interfaces to provide information for AI support - Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
UiPath Test Automation using UiPath Test Suite series, part 5 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with devops.
Topics covered:
CI/CD within UiPath
End-to-end overview of CI/CD pipeline with Azure devops
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
Generative AI Deep Dive: Advancing from Proof of Concept to Production - Aggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Pushing the limits of ePRTC: 100ns holdover for 100 days - Adtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor... - Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... - DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses.
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features available on those devices, but many of the features provide convenience and capability while sacrificing security. This best practices guide outlines steps users can take to better protect personal devices and information.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 - Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
1. Predictive Coding 2.0
Making E-Discovery
More Efficient and Cost Effective
John Tredennick
Jeremy Pickens
Jim Eidelman
2. How Many Do I Have to Check?
1. You have a bag with 1 million M&Ms
2. It contains mostly brown M&Ms
3. You cannot see into the bag
4. You have a scoop that will pull out 100 M&Ms at a time
5. Your hope is that there are no red M&Ms in the bag
6. You pull out a scoop and they are all brown
How many scoops do you need to review to be confident there are no red M&Ms?
3. Let’s Take a Poll
How many scoops? 1? 2? 3? 5? 10? 20? 100? 500? 1,000?
4. How Confident Do You Need to Be?
Does 95% work? How about 99%?
How many errors can you tolerate?
§ Five out of a hundred?
§ One out of a hundred?
§ One percent = 10,000
At a 95% confidence level and a 5% margin of error: 384 M&Ms
At a 99% confidence level and a 1% margin of error: 459 M&Ms
At a 100% confidence level and a 0% margin of error: 1,000,000 M&Ms
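The familiar 384 figure for 95% confidence and a ±5% margin comes from the standard worst-case sample-size formula for a proportion, n = z²·p(1−p)/e² with p = 0.5; the other slide figures depend on the exact assumptions and rounding conventions used. A minimal sketch (rounding up gives 385; the commonly quoted 384 comes from rounding down):

```python
from math import ceil
from statistics import NormalDist

def sample_size(confidence, margin, p=0.5):
    """Worst-case sample size for estimating a proportion:
    n = z^2 * p * (1 - p) / e^2, with p = 0.5 maximizing the variance."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided z-score
    return ceil(z * z * p * (1 - p) / margin ** 2)

print(sample_size(0.95, 0.05))  # the slide's 95% / ±5% case
```

Note how quickly the required sample grows as the margin of error shrinks: the denominator is e², so halving the margin quadruples the sample.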
8. What Have the Courts Said?
“Until there is a judicial opinion approving (or even critiquing) the use of predictive coding, counsel will just have to rely on this article as a sign of judicial approval. In my opinion, computer-assisted coding should be used in those cases where it will help ‘secure the just, speedy, and inexpensive’ (Fed. R. Civ. P. 1) determination of cases in our e-discovery world.”
Magistrate Judge Andrew Peck
9. Predictive Coding 1.0
1. Assemble your corpus.
2. Assemble a seed set of documents.
3. Review the seed set.
4. Apply machine learning and automatically tag the remainder of the corpus.
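The four steps above can be sketched with a toy nearest-centroid classifier; this is an illustration, not the deck's actual algorithm, and the seed documents and coding labels below are hypothetical:

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Steps 1-3: a reviewed seed set (hypothetical documents and coding calls)
seed = [
    ("discuss pricing of components with competitor", "responsive"),
    ("price fixing meeting at the sports bar", "responsive"),
    ("hamburger buns for the company picnic", "junk"),
    ("broncos game tickets this weekend", "junk"),
]

# Build one centroid vector per coding label
centroids = {}
for text, label in seed:
    centroids.setdefault(label, Counter()).update(vectorize(text))

def auto_tag(doc):
    """Step 4: tag an unreviewed document with the nearest centroid's label."""
    v = vectorize(doc)
    return max(centroids, key=lambda lbl: cosine(v, centroids[lbl]))

print(auto_tag("let's meet to discuss component pricing"))  # → responsive
```

The key point for what follows: the centroids are built once from the seed set, so the model is frozen against whatever the corpus looked like at training time.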
10. Predictive Coding 1.0
§ Tremendous gains in review effectiveness
§ Substantial cost savings
§ It works. Often quite well
…when the corpus is complete.
13. In which upload and on which day do your responsive documents show up?
67 uploads over 166 days
Terms that do not appear early begin appearing later.
14. Machine-Assisted Decision Making
Upload timeline of a 6 TB case. When should machine-assisted decision making (e.g., early case assessment) begin? Is it here? Or here?
15. Example: Responsive Early, Junk Later
To: bob@company.com, alice@company.com
From: charles@company.com
Subject: Company Picnic
Bob, would you coordinate with Alice and make sure we have enough hamburger buns for the company picnic? Please try and find them at a reasonable price.
Coded responsive early in the case; later recognized as junk.
16. Example: Junk Early, Responsive Later
To: bob@company.com, alice@privatemail.com
From: charles@company.com
Subject: Get Together
Let’s get together at 7pm at the Sports Bar to discuss pricing of our components. The Broncos are playing and I really want to watch Tebow.
Coded junk early in the case; later recognized as responsive.
17. Problems With Predictive Coding 1.0
The corpus is almost never complete
§ Continuous collection and rolling uploads
§ When does “Early Case Assessment” begin?
Changing Issues
§ Responsiveness is “bursty”
Shifting Concept Relationships
§ Due both to increasing corpus and changing issues
§ Exploration is extremely limited
18. Our Approach
Predictive Coding 2.0 must handle dynamic change and flux.
We have developed a flexible analytics framework based on bipartite graphs.
It is aware of changes in both the corpus and the coding, enabling smart review and adaptive related-concept suggestion as information pours in.
19. Our Approach
Avoid the lock-in that arises from poor decisions made early in the matter, when corpus (collection) and coding information is incomplete.
Goal: Continuous Case Assessment
20. What Is Underneath?
A full bipartite graph of the documents and the features (e.g., words, phrases, dates) that comprise those documents.
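A minimal sketch of such a document-to-feature bipartite index, assuming set-valued adjacency on each side; the class name and the two-hop related-term walk are illustrative, not Catalyst's implementation:

```python
from collections import defaultdict

class BipartiteIndex:
    """Document ↔ feature bipartite graph that grows as uploads arrive."""

    def __init__(self):
        self.doc_to_terms = defaultdict(set)   # document side of the graph
        self.term_to_docs = defaultdict(set)   # feature (term) side

    def add_document(self, doc_id, text):
        """Incrementally link a new document to its terms; no full re-index."""
        for term in set(text.lower().split()):
            self.doc_to_terms[doc_id].add(term)
            self.term_to_docs[term].add(doc_id)

    def related_terms(self, term):
        """Two-hop walk: term -> its documents -> co-occurring terms."""
        scores = defaultdict(int)
        for doc in self.term_to_docs[term]:
            for other in self.doc_to_terms[doc]:
                if other != term:
                    scores[other] += 1
        return sorted(scores, key=scores.get, reverse=True)

idx = BipartiteIndex()
idx.add_document("d1", "fuel piping ventilation")
idx.add_document("d2", "fuel benzene piping")
print(idx.related_terms("fuel")[0])  # → piping (co-occurs in both documents)
```

Because both sides of the graph are updated per document, related-term suggestions change as soon as a new upload or a new coding call arrives, which is the feedback behavior the next slides describe.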
22. Feedback: Immediate and Continuous
Continuous feedback aids better decision making and predictive coding.
Adapts to both:
§ New arrival of coding information
§ New arrival of documents and terms
24. Predictive Coding 2.0
Feedback – and improvement – is iterative, continuous, amplified.
The more you review, the less you have to review.
[Chart axis: % of docs examined manually]
25. Better Decisions As Understanding Improves
Term relationships change over time.
Using continuous improvement, decisions can be revised and refined as the matter proceeds.
26. Terms and Documents
Time uncovers new relationships.
27. Looking at Concepts Over Time
Start with the key term “fuel”. At 20% of the collection, one set of related terms appears; at 65%, a different set.
[Word clouds of related terms at 20% and 65%: lube, fuels, piping, fob, battery, purity, ethane, mounted, petrochemicals, redundant, fin, batteries, paraxylene, compartments, cif, mixture, phy, airflow, fwd, ansi, swopt, ventilation, brent, partials, chargers, brg, stainless, loc, swap, rotor, benzene, bleed, diff, accessory, spd, plenum, liquids, detector, opt]
30. Putting Related Concepts to Work
The whole corpus: a TREC collection with many topics identified.
Topic 203: …whether the Company had met, or could, would, or might meet its financial forecasts, models, projections, or plans…
Topic 205: …analyses, evaluations, projections, plans, and reports on the volume(s) or geographic location(s) of energy loads.
31. Model In the Whole Collection
Look at the keyword “model”. Scope is the whole collection.
Term (Score): modeling (1000), equation (864), stochastic (706), variables (677), parameters (518), probability (365), simulation (337), assumption (325), returns (251), curves (211)
32. Model In Topic 203
Look at the keyword “model”. Scope: Topic 203 (meeting financial forecasts).
Term (Score): flows (1000), assumptions (913), gains (872), shares (864), liquidity (486), fluctuations (374), analysts (285), cents (254), whitewing (237), handles (166)
33. Model In Topic 205
Look at the keyword “model”. Scope: Topic 205 (analyzing energy volumes).
Term (Score): bids (1000), congestion (611), loads (455), constraints (354), clearing (292), zonal (194), signals (192), procure (190), dispatch (152), csc (120)
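Scoped related-term scoring of this kind can be sketched as keyword co-occurrence restricted to a topic scope, normalized so the top term scores 1000; the normalization is an assumption about how the slides' scale was produced, and the mini-corpus below is hypothetical:

```python
from collections import defaultdict

def related_scores(docs, keyword, scope=None):
    """Score terms by co-occurrence with `keyword` among docs in `scope`.
    Scores are scaled so the top term gets 1000 (assumed slide convention)."""
    counts = defaultdict(int)
    for doc in docs:
        if scope is not None and doc["topic"] != scope:
            continue
        terms = set(doc["text"].lower().split())
        if keyword in terms:
            for t in terms - {keyword}:
                counts[t] += 1
    if not counts:
        return {}
    top = max(counts.values())
    return {t: round(1000 * c / top) for t, c in counts.items()}

# Hypothetical mini-corpus with topic labels
docs = [
    {"topic": 203, "text": "model flows assumptions"},
    {"topic": 203, "text": "model flows gains"},
    {"topic": 205, "text": "model bids congestion"},
]
print(related_scores(docs, "model", scope=203)["flows"])  # → 1000
```

Changing `scope` (or, in the real system, any coding or metadata filter) changes which documents feed the counts, which is why the same keyword yields different related-term lists per topic.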
34. Model In Comparison
Now, imagine this with batches and coding changes over time!
Whole Corpus | Topic 203 | Topic 205
modeling | flows | bids
equation | assumptions | congestion
stochastic | gains | loads
variables | shares | constraints
parameters | liquidity | clearing
probability | fluctuations | zonal
simulation | analysis | signal
assumption | cents | procure
returns | whitewing | dispatch
curves | handles | csc
Note: Our system can accept any combination of coding and metadata filters to dynamically assess your data.
36. Predictive Coding 2.0
Problem: The corpus is almost never complete
Answer: Review Algorithms that are iterative and continuous
Problem: Changing Issues
Answer: Review Algorithms that are adaptive and continuous
Problem: Shifting Concept Relationships
Answer: Concept Relationships that are calculated dynamically, on-the-fly, and coding-aware.
Continuous Case Assessment
37. Analytics Consulting
§ Analytics consulting and predictive ranking for nearly 4 years
§ How it started -- Before “Predictive Coding” became popular:
“Can’t you predict what documents are probably
relevant based on your review so far?”
– Judge, SDNY
§ Predictive Ranking: Iterative search techniques + algorithms
§ Then off-the-shelf Predictive Coding 1.0 technologies
§ Catalyst’s research is exciting! We apply the research to real-world scenarios. Applying Bipartite Analytics…
38. Smart Review with Bipartite Analytics
Technology Advantages:
§ Accurate
§ Dynamic
§ Flexible
§ “Just in Time” suggestions
39. Smart Review Scenarios
1. “What happened” – examples: FCPA investigation, conspiracy ECA
2. Typical large-scale litigation with lots of ESI – e.g., class action lawsuit
3. Highly complex litigation with multiple issues – e.g., patent and unfair competition claims
40. Scenario 1 – What happened?
Goal: Rapidly determine facts and resolve matter if possible
Applying the Technology
Small number of knowledgeable attorneys drill into documents using the fusion of advanced search features and flexible predictive coding.
44. Scenario 1 – What happened?
Goal: Rapidly determine facts and resolve matter if possible
Applying the Technology
Small number of knowledgeable attorneys drill into documents using the fusion of advanced search features and flexible predictive coding.
§ Faster location of valuable “veins” of information due to search filters
§ Rapid learning and application of that learning through flexible, “just in time” Predictive Coding 2.0
§ “Choose your own adventure”
45. Scenario 2 – Large-Scale Litigation
Goal: Minimize cost through learning across a large document set, increase quality with focused review, and maximize protection of privilege and trade secrets
Applying the Technology:
§ Prioritized review based on rapid, continuous learning
§ Large-scale defensible culling
§ More accurate ranking of “potentially privileged” documents
46. Scenario 3 – Highly Complex Litigation
Goal: Review and produce with multiple and changing issues
Applying the Technology
§ Rapid learning across multiple topics
§ Leverage the ability to adjust as topics change
§ Review quality improves because of focus
§ Explore otherwise hidden subjects with Concept Explorer
§ Leverage learning across narrow, focused lines of inquiry (e.g., emails between two people in a narrow time window)
§ Protect privileged documents
47. Predictive Coding 2.0
Making E-Discovery
More Efficient and Cost Effective
John Tredennick
Jeremy Pickens
Jim Eidelman