This document summarizes a presentation about building flexible tools in Python to store and report on CSV data. The presentation covers using Python data structures such as lists, sets, tuples, and dictionaries, together with the csv reader, to analyze CSV files. It demonstrates counting the occurrences of each value in each column using collections.Counter and indexing into counters with tuples, and concludes with examples of building a class that recursively prints indented summaries of the data at different levels.
PyGotham 09:45 AM - 10:45 AM on August 17, 2014.
If you're new to Python, you might find that you're using Python as if it were C. This talk will demonstrate how to take advantage of Python's special data structures to build tools for analyzing and creating nicely-formatted reports from CSV data.
Data Science. .Net/C# Monte Carlo modeling. The R Programming language. See it all come together in one place in this talk. Presentation date 6/13 at Lake County .NET User Group.
Screening data is still a laborious task in R. Calculating summary statistics for all variables, listing the occurrence of missing data, and producing some kind of graphics is a three-click process in SPSS, but base R does not contain higher-level functions for quickly describing bigger datasets in a more or less automated way. The R package DescTools addresses three problem areas. First, it provides functions meant to facilitate the construction of univariate and bivariate descriptive tables for several variable types. Second, the connectivity between R and MS Office is enhanced by an easy interface to Word and Excel, so generating reports directly in Word and importing data directly from Excel becomes an easy task. Finally, a considerable number of basic functions (operators, string and date functions, statistics, tests, several plot types) not present in base R are collected from other packages and internet sources, with the goal of consolidating them in ONE package instead of dozens and of providing a common and consistent interface for function and argument naming, NA handling, recycling rules, etc.
From usability to performance, analytics to architecture: for report developers, the user experience (UX) design of your data model is quickly becoming more important than the pretty pictures that sit on top of it. This session will concentrate on the design decisions needed to increase the usage of your reports.
Data Manipulation with Numpy and Pandas in Python
Starting with Numpy
#load the library and check its version, just to make sure we aren't using an older version
import numpy as np
np.__version__
'1.12.1'
#create a list comprising numbers from 0 to 9
L = list(range(10))
#converting integers to strings - this style of handling lists is known as a list comprehension.
#List comprehensions offer a versatile way to handle list manipulation tasks easily. We'll learn about them in future tutorials. Here's an example.
[str(c) for c in L]
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
[type(item) for item in L]
[int, int, int, int, int, int, int, int, int, int]
Creating Arrays
Numpy arrays are homogeneous: unlike lists, all of their elements share one data type (integer, float, etc.).
#creating arrays
np.zeros(10, dtype='int')
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
#creating a 3 row x 5 column matrix
np.ones((3,5), dtype=float)
array([[ 1., 1., 1., 1., 1.],
[ 1., 1., 1., 1., 1.],
[ 1., 1., 1., 1., 1.]])
#creating a matrix with a predefined value
np.full((3,5),1.23)
array([[ 1.23, 1.23, 1.23, 1.23, 1.23],
[ 1.23, 1.23, 1.23, 1.23, 1.23],
[ 1.23, 1.23, 1.23, 1.23, 1.23]])
#create an array with a set sequence
np.arange(0, 20, 2)
array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])
#create an array of evenly spaced values within a given range
np.linspace(0, 1, 5)
array([ 0., 0.25, 0.5 , 0.75, 1.])
#create a 3x3 array of random values drawn from a normal distribution with mean 0 and standard deviation 1
np.random.normal(0, 1, (3,3))
array([[ 0.72432142, -0.90024075, 0.27363808],
[ 0.88426129, 1.45096856, -1.03547109],
[-0.42930994, -1.02284441, -1.59753603]])
#create an identity matrix
np.eye(3)
array([[ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.]])
#set a random seed
np.random.seed(0)
x1 = np.random.randint(10, size=6) #one dimension
x2 = np.random.randint(10, size=(3,4)) #two dimension
x3 = np.random.randint(10, size=(3,4,5)) #three dimension
print("x3 ndim:", x3.ndim)
print("x3 shape:", x3.shape)
print("x3 size: ", x3.size)
x3 ndim: 3
x3 shape: (3, 4, 5)
x3 size:  60
Array Indexing
The important thing to remember is that indexing in Python starts at zero.
x1 = np.array([4, 3, 4, 4, 8, 4])
x1
array([4, 3, 4, 4, 8, 4])
#access the value at index zero
x1[0]
4
#access the fifth value
x1[4]
8
#get the last value
x1[-1]
4
#get the second last value
x1[-2]
8
#in a multidimensional array, we need to specify row and column index
x2
array([[3, 7, 5, 5],
[0, 1, 5, 9],
[3, 0, 5, 0]])
#3rd row and 4th column value
x2[2,3]
0
#3rd row, last column value
x2[2,-1]
0
#replace value at 0,0 index
x2[0,0] = 12
x2
array([[12, 7, 5, 5],
[ 0, 1, 5, 9],
[ 3, 0, 5, 0]])
Array Slicing
Now, we'll learn to access multiple or a range of elements from an array.
x = np.arange(10)
x
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
#from start to 4th position
x[:5]
array([0, 1, 2, 3, 4])
TECHWEEKENDS Presents;
De-cluttering Machine Learning in collaboration with IEEE GGSIPU
Are you clueless when you hear people say words like Unsupervised Learning and Regression? Worry not! GDSC USICT is there for you!
We are organizing a session on Machine Learning where you will learn the basics of machine learning while developing a hands-on project from scratch and seeing the results in real time. You will also learn about different algorithms and models and various data preparation techniques.
Overview:
This hands-on lab allows you to follow and experiment with the critical steps of developing a program, including the program description, analysis, design (program design, pseudocode), test plan, and implementation in C code. The example provided uses sequential statements, repetition statements, and nested repetition statements.
Program Description:
This program will calculate the average of 10 positive integers. The program will ask the user to enter 10 integers. If any of the values entered is negative, a message will be displayed asking the user to enter a value greater than 0. The program will use a loop to input the data.
Analysis:
I will use sequential, selection and repetition programming statements.
The program will loop for 10 positive numbers, prompting the user to enter a number.
I will define three integer variables: count, value, and sum. count will store how many values greater than 0 have been entered, value will store the current input, and sum will store the sum of all 10 integers.
I will define one double variable: avg, which will store the average of the ten positive integers input.
The sum will be calculated by the formula sum = sum + value. For example, if the first value entered was 4 and the second was 10:
sum = sum + value = 0 + 4 = 4
sum = 4 + 10 = 14
Values and sum can be input and accumulated within a repetition loop:
While count < 10
    Input value
    sum = sum + value
End While
The average can be calculated by: avg = sum / count
A selection statement can be used inside the loop to make sure the input value is positive:
If value >= 0 then
    count = count + 1
    sum = sum + value
Else
    Input value
End If
Program Design:
Main
    // This program will calculate the average of 10 integer numbers
    // Declare variables
    // Initialize variables
    // Loop through 10 numbers
        // Prompt for positive integer
        // Get input
        // Test input value for > 0
        // If (value > 0)
            // Increment counter
            // Accumulate sum
        // Else
            // Display message to enter a positive integer
            // Prompt for positive integer
            // Get input
        // End If
    // End loop
    // Calculate average
    // Print the results (average)
End
Test Plan:
To verify this program is working properly, the following input values could be used for testing:

Test Case 1
Input: 1 1 1 0 1 2 0 1 3 2
Expected Output: Average = 1.2

Test Case 2
Input: 100 100 100 100 -100 100 200 -200 200 200 200 200
Expected Output: "Input a positive value" after each negative entry; average is 150.0

NOTE: test #2 has 12 input numbers because there are two negative numbers.
Pseudocode:
Main
    // This program will calculate the average of 10 positive integers.
    // Declare variables
    Declare count, value, sum as Integer
    Declare avg as Double
    // Initialize values
    Set count = 0
    Set sum = 0
    Set avg = 0.0
    // Loop through 10 integers
    While count < 10
        Input value
        If (value >= 0)
            sum = sum + value
            count = count + 1
        Else
            Print "*** Value must be positive ***"
            Input value
        End If
    End While
    // Calculate average
    avg = sum / count
    // Print results
    Print avg
End // End of Main
C Code
The following is the C code, which will compile and execute in an online C compiler.
A quick review and demonstration on how to get started on parallel computing with R. Includes an example of SNOW cluster set up in the departmental lab.
Assignment #9
First, we recall some definitions that will be helpful in answering Questions 1-3.
A population parameter is a single value that describes a population characteristic (such as center, spread, location, etc.).
EXAMPLES:
· The proportion p of adults in the United States who worry about money
· The mean lifetime m of a certain brand of computer hard disks
· The lower quartile q1 of a population of incomes
· The standard deviation s of the nicotine content per cigarette produced by a certain manufacturer
In real life, population parameters are usually unknown. An important objective of statistical inference is to use information obtained from random sample or samples (depending on the design of the study) to estimate parameters and to test claims made about them.
A statistic is a number computed from the sample data only. The resulting sample value must be independent of the population parameters.
Statistics are used as numerical estimates of population parameters.
Example: A random sample of 1500 national adults shows that 33% of Americans worry about money. The margin of error is +/- 3 percentage points.
Statistics have variation. Different random samples of size n from the same population will usually yield different values of the same statistic. This is called sampling variability.
The sampling distribution of a statistic is the distribution of the values taken by the statistic over all possible random samples of the same size from a given population.
What do we look for in a sampling distribution?
Bias: A statistic is unbiased if its sampling distribution has a mean that is equal to the true value of the parameter being estimated by that statistic.
Variability: How much variation is there in the sampling distribution?
The goal of this assignment is to simulate the sampling distribution of some statistics.
Question 1:
An urn contains 50 beads. The beads are identical in shape and have one of two colors: blue and orange. We would like to estimate the proportion p of blue beads. We select without replacement a sample of 10 beads. The relevant statistic is the sample proportion p̂ of blue beads (i.e., the number of blue beads in the sample divided by 10).
For the purpose of the simulation exercise, we will assume that the urn contains exactly 15 blue beads or, equivalently, that the proportion of blue beads is p = 0.30.
i) Select 100 samples of size 10 from the urn.
ii) Compute the sample proportion p̂ of blue beads for each of the 100 samples found in (i).
iii) Make a histogram of the values of p̂ found in (ii) (that is the approximate sampling distribution of p̂).
iv) Find the summary statistics of the 100 values of p̂.
v) Use the histogram and the summary statistics to describe the approximate sampling distribution of p̂.
vi) Is p̂ an unbiased estimator of p? Hint: evaluate the difference between p = 0.30 and the mean value. ...
NoSQL - MongoDB. Agility, scalability, performance. I am going to talk about the basics of NoSQL and MongoDB. Why do some projects require RDBMSs and others NoSQL databases? What are the pros and cons of NoSQL vs. SQL? How is data stored and transferred in MongoDB? What query language is used? How does MongoDB support high availability and automatic failover with the help of replication? What is sharding and how does it help to support scalability? Also covered: the newest concurrency levels, collection-level and document-level.
May Marketo Masterclass, London MUG, May 22, 2024
Can't make Adobe Summit in Vegas? No sweat because the EMEA Marketo Engage Champions are coming to London to share their Summit sessions, insights and more!
This is a MUG with a twist you don't want to miss.
Graspan: A Big Data System for Big Code Analysis
We built a disk-based parallel graph system, Graspan, that uses a novel edge-pair centric computation model to compute dynamic transitive closures on very large program graphs.
We implement context-sensitive pointer/alias and dataflow analyses on Graspan. An evaluation of these analyses on large codebases such as Linux shows that their Graspan implementations scale to millions of lines of code and are much simpler than their original implementations.
These analyses were used to augment the existing checkers; these augmented checkers found 132 new NULL pointer bugs and 1308 unnecessary NULL tests in Linux 4.4.0-rc5, PostgreSQL 8.3.9, and Apache httpd 2.2.18.
- Accepted at ASPLOS ’17, Xi’an, China.
- Featured in the tutorial, Systemized Program Analyses: A Big Data Perspective on Static Analysis Scalability, ASPLOS ’17.
- Invited for presentation at SoCal PLS ’16.
- Invited for poster presentation at PLDI SRC ’16.
Workshop - Innovating with Generative AI and Knowledge Graphs
Go beyond the media hype around AI and discover practical techniques for using AI responsibly across your organization's data. Explore how knowledge graphs can increase accuracy, transparency, and explainability in generative AI systems. You will leave with hands-on experience combining data relationships with LLMs to bring domain-specific context and improve reasoning.
Bring your laptop and we will guide you through setting up your own generative AI stack, with practical, coded examples to get you started in minutes.
OpenMetadata Community Meeting - 5th June 2024
The OpenMetadata Community Meeting was held on June 5th, 2024. In this meeting, we discussed the data quality capabilities that are integrated with the Incident Manager, providing a complete solution for your data observability needs. Watch the end-to-end demo of the data quality features.
* How to run your own data quality framework
* What is the performance impact of running data quality frameworks
* How to run the test cases in your own ETL pipelines
* How the Incident Manager is integrated
* Get notified with alerts when test cases fail
Watch the meeting recording here - https://www.youtube.com/watch?v=UbNOje0kf6E
An Enterprise Resource Planning (ERP) system includes various modules that reduce any business's workload. Additionally, it organizes workflows, which drives enhanced productivity. Here is a detailed explanation of the ERP modules. Going through the points will help you understand how the software is changing work dynamics.
To know more details here: https://blogs.nyggs.com/nyggs/enterprise-resource-planning-erp-system-modules/
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
👉👉 Click Here To Get More Info 👇👇
https://sumonreview.com/ai-pilot-review/
AI Pilot Review: Key Features
✅Deploy AI expert bots in Any Niche With Just A Click
✅With one keyword, generate complete funnels, websites, landing pages, and more.
✅More than 85 AI features are included in the AI pilot.
✅No setup or configuration; use your voice (like Siri) to do whatever you want.
✅You Can Use AI Pilot To Create your version of AI Pilot And Charge People For It…
✅ZERO Manual Work With AI Pilot. Never write, Design, Or Code Again.
✅ZERO Limits On Features Or Usages
✅Use Our AI-powered Traffic To Get Hundreds Of Customers
✅No Complicated Setup: Get Up And Running In 2 Minutes
✅99.99% Up-Time Guaranteed
✅30 Days Money-Back Guarantee
✅ZERO Upfront Cost
See My Other Reviews Article:
(1) TubeTrivia AI Review: https://sumonreview.com/tubetrivia-ai-review
(2) SocioWave Review: https://sumonreview.com/sociowave-review
(3) AI Partner & Profit Review: https://sumonreview.com/ai-partner-profit-review
(4) AI Ebook Suite Review: https://sumonreview.com/ai-ebook-suite-review
Artificial Intelligence and XPath Extension Functions
The purpose of this presentation is to provide an overview of how you can use AI from XSLT, XQuery, Schematron, or XML Refactoring operations, the potential benefits of using AI, and some of the challenges we face.
Zoom is a comprehensive platform designed to connect individuals and teams efficiently. With its user-friendly interface and powerful features, Zoom has become a go-to solution for virtual communication and collaboration. It offers a range of tools, including virtual meetings, team chat, VoIP phone systems, online whiteboards, and AI companions, to streamline workflows and enhance productivity.
Do you want Software for your Business? Visit Deuglo
Deuglo has top Software Developers in India. They are experts in software development and help design and create custom Software solutions.
Deuglo follows a seven-step method for delivering its services to customers, called the software development life cycle (SDLC) process.
Requirement — collecting the requirements is the first phase in the SDLC process.
Feasibility Study — after the requirements are collected, a feasibility study is carried out before design begins.
Design — in this phase, they start designing the software.
Coding — when the design is completed, the developers start coding the software.
Testing — when the coding is done, the testing team starts testing.
Installation — after testing is complete, the application is deployed to the live server and launched!
Maintenance — once development is complete and customers start using the software, ongoing maintenance begins.
Mobile App Development Company In Noida | Drona InfotechDrona Infotech
Looking for a reliable mobile app development company in Noida? Look no further than Drona Infotech. We specialize in creating customized apps for your business needs.
Visit Us For : https://www.dronainfotech.com/mobile-application-development/
Quarkus Hidden and Forbidden ExtensionsMax Andersen
Quarkus has a vast extension ecosystem and is known for its subsonic and subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them less interesting - quite the opposite.
Come join this talk to see some tips and tricks for using Quarkus and some of the lesser known features, extensions and development techniques.
E-commerce Application Development Company.pdfHornet Dynamics
Your business can reach new heights with our assistance as we design solutions that are specifically appropriate for your goals and vision. Our eCommerce application solutions can digitally coordinate all retail operations processes to meet the demands of the marketplace while maintaining business continuity.
Transform Your Communication with Cloud-Based IVR SolutionsTheSMSPoint
Discover the power of Cloud-Based IVR Solutions to streamline communication processes. Embrace scalability and cost-efficiency while enhancing customer experiences with features like automated call routing and voice recognition. Accessible from anywhere, these solutions integrate seamlessly with existing systems, providing real-time analytics for continuous improvement. Revolutionize your communication strategy today with Cloud-Based IVR Solutions. Learn more at: https://thesmspoint.com/channel/cloud-telephony
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet, they often turn into annoying tasks riddled with frustration, hostility, unclear feedback and lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll have the knowledge on how to organize and improve your code review proces
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppGoogle
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
👉👉 Click Here To Get More Info 👇👇
https://sumonreview.com/ai-fusion-buddy-review
AI Fusion Buddy Review: Key Features
✅Create Stunning AI App Suite Fully Powered By Google's Latest AI technology, Gemini
✅Use Gemini to Build high-converting Converting Sales Video Scripts, ad copies, Trending Articles, blogs, etc.100% unique!
✅Create Ultra-HD graphics with a single keyword or phrase that commands 10x eyeballs!
✅Fully automated AI articles bulk generation!
✅Auto-post or schedule stunning AI content across all your accounts at once—WordPress, Facebook, LinkedIn, Blogger, and more.
✅With one keyword or URL, generate complete websites, landing pages, and more…
✅Automatically create & sell AI content, graphics, websites, landing pages, & all that gets you paid non-stop 24*7.
✅Pre-built High-Converting 100+ website Templates and 2000+ graphic templates logos, banners, and thumbnail images in Trending Niches.
✅Say goodbye to wasting time logging into multiple Chat GPT & AI Apps once & for all!
✅Save over $5000 per year and kick out dependency on third parties completely!
✅Brand New App: Not available anywhere else!
✅ Beginner-friendly!
✅ZERO upfront cost or any extra expenses
✅Risk-Free: 30-Day Money-Back Guarantee!
✅Commercial License included!
See My Other Reviews Article:
(1) AI Genie Review: https://sumonreview.com/ai-genie-review
(2) SocioWave Review: https://sumonreview.com/sociowave-review
(3) AI Partner & Profit Review: https://sumonreview.com/ai-partner-profit-review
(4) AI Ebook Suite Review: https://sumonreview.com/ai-ebook-suite-review
#AIFusionBuddyReview,
#AIFusionBuddyFeatures,
#AIFusionBuddyPricing,
#AIFusionBuddyProsandCons,
#AIFusionBuddyTutorial,
#AIFusionBuddyUserExperience
#AIFusionBuddyforBeginners,
#AIFusionBuddyBenefits,
#AIFusionBuddyComparison,
#AIFusionBuddyInstallation,
#AIFusionBuddyRefundPolicy,
#AIFusionBuddyDemo,
#AIFusionBuddyMaintenanceFees,
#AIFusionBuddyNewbieFriendly,
#WhatIsAIFusionBuddy?,
#HowDoesAIFusionBuddyWorks
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
Mnh csv python
1. Building flexible tools to store sums and report on CSV data
Presented by
Margery Harrison
Audience level: Novice
09:45 AM - 10:45 AM
August 17, 2014
Room 704
2. Python Flexibility
● Basic, Fortran, C, Pascal, Javascript,...
● At some point, there's a tendency to think the same way and just translate it
● You can write Python as if it were C
● Or you can take advantage of Python's special data structures.
● The second option is a lot more fun.
3. Using Python data structures to report on CSV data
● Lists
● Sets
● Tuples
● Dictionaries
● CSV Reader
● DictReader
● Counter
4. Also,
● Using tuples as dictionary keys
● Using enumerate() to count how many times you've looped
– See “Loop like a Native”: http://nedbatchelder.com/text/iter.html
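Ned Batchelder's point in a two-line sketch (my example, not from the deck): enumerate() pairs each item with its index, so there's no hand-rolled counter to maintain.

```python
# enumerate() yields (index, item) pairs; start=1 for human-friendly numbering.
colors = ['red', 'blue', 'green']
for i, color in enumerate(colors, start=1):
    print('{0}: {1}'.format(i, color))
```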
5. Code Development Method
● Start with simplest possible version
● Test and validate
● Iterative improvements
– Make it prettier
– Make it do more
– Make it more general
6. This is a CSV file
color,size,shape,number
red,big,square,3
blue,big,triangle,5
green,small,square,2
blue,small,triangle,1
red,big,square,7
blue,small,triangle,3
13. How many of each?
● It's nice to have a listing that shows the variety of objects that can appear in each column.
● Next, we'd like to count how many of each.
● And guess what? Python has a special data structure for that.
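That data structure is collections.Counter. Here is a sketch (Python 3 syntax, variable names are mine) of counting each column's values in the slide-6 CSV, inlined so the example is self-contained:

```python
import csv
import io
from collections import Counter

# The CSV from slide 6, inlined for a self-contained example.
data = """color,size,shape,number
red,big,square,3
blue,big,triangle,5
green,small,square,2
blue,small,triangle,1
red,big,square,7
blue,small,triangle,3
"""

reader = csv.DictReader(io.StringIO(data))
# One Counter per column, keyed by the column header.
counts = {name: Counter() for name in reader.fieldnames}
for row in reader:
    for head in reader.fieldnames:
        counts[head][row[head]] += 1  # count one occurrence of this value

print(counts['color'])  # e.g. Counter({'blue': 3, 'red': 2, 'green': 1})
```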
19. Output
color
blue : 3
green : 1
red : 2
shape
square : 3
triangle: 3
number
1 : 1
3 : 2
2 : 1
5 : 1
7 : 1
size
small : 3
big : 3
20. You might ask, why not this?
for row in r:
    for head in r.fieldnames:
        field_value = row[head]
        possible_values[head].add(field_value)
        #count_of_values.update(row[head])
        count_of_values.update(field_value)
print count_of_values
21. Because
Counter({'e': 13, 'l': 12, 'a': 9, 'r': 9, 'g': 7, 'b': 6, 'i': 6, 's': 6, 'u': 6, 'n': 4, 'm': 3, 'q': 3, 't': 3, 'd': 2, '3': 2, '1': 1, '2': 1, '7': 1, '5': 1})
color
blue : 0
green : 0
red : 0
shape
square : 0
triangle: 0
number
1 : 1
3 : 2
2 : 1
5 : 1
7 : 1
size
small : 0
big : 0
22. Output
color
blue : 3
green : 1
red : 2
shape
square : 3
triangle: 3
number
1 : 1
3 : 2
2 : 1
5 : 1
7 : 1
size
small : 3
big : 3
23. How many red squares?
● We can use tuples as an index into the counter
– (red,square)
– (big,red,square)
– (small,blue,triangle)
– (small,square)
24. Let's use a simpler CSV
color,size,shape
red,big,square
blue,big,triangle
green,small,square
blue,small,triangle
red,big,square
blue,small,triangle
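Counting those combinations is just a matter of incrementing tuple keys. A sketch of the idea (not the slides' code) using the simpler three-column rows above:

```python
from collections import Counter

# The rows of the simpler CSV (slide 24); column order: color, size, shape.
rows = [
    ('red', 'big', 'square'),
    ('blue', 'big', 'triangle'),
    ('green', 'small', 'square'),
    ('blue', 'small', 'triangle'),
    ('red', 'big', 'square'),
    ('blue', 'small', 'triangle'),
]

combos = Counter()
for color, size, shape in rows:
    # Tuples are hashable, so they work directly as Counter keys.
    combos[(color, shape)] += 1
    combos[(size, color, shape)] += 1

print(combos[('red', 'square')])              # 2
print(combos[('small', 'blue', 'triangle')])  # 2
```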
25. Counting Tuples: trying to use magic update()
>>> c=collections.Counter([('a,b'),('c,d,e')])
>>> c
Counter({'a,b': 1, 'c,d,e': 1})
>>> c.update(('a','b'))
>>> c
Counter({'a': 1, 'b': 1, 'a,b': 1, 'c,d,e': 1})
>>> c.update((('a','b'),))
>>> c
Counter({'a': 1, ('a', 'b'): 1, 'b': 1, 'a,b': 1, 'c,d,e': 1})
30. Combo Count Output
color
blue : 3
3 blue in 1 combinations:
('blue', 'big'): 1
('blue', 'small'): 2
3 blue in 2 combinations:
('blue', 'big', 'triangle'): 1
('blue', 'small', 'triangle'): 2
green : 1
1 green in 1 combinations:
('green', 'small'): 1
1 green in 2 combinations:
('green', 'small', 'square'): 1
red : 2
2 red in 1 combinations:
('red', 'big'): 2
2 red in 2 combinations:
('red', 'big', 'square'): 2
shape
square : 3
3 square in 1 combinations:
3 square in 2 combinations:
('red', 'big', 'square'): 2
('green', 'small', 'square'): 1
triangle: 3
3 triangle in 1 combinations:
3 triangle in 2 combinations:
('blue', 'big', 'triangle'): 1
('blue', 'small', 'triangle'): 2
size
small : 3
3 small in 1 combinations:
('blue', 'small'): 2
('green', 'small'): 1
3 small in 2 combinations:
('green', 'small', 'square'): 1
('blue', 'small', 'triangle'): 2
big : 3
3 big in 1 combinations:
('blue', 'big'): 1
('red', 'big'): 2
3 big in 2 combinations:
('red', 'big', 'square'): 2
('blue', 'big', 'triangle'): 1
31. Well, that's ugly
● We need to make it prettier
● We need to write out to a file
● We need to break things up into Classes
32. Printing Combination Levels
Number of Squares
Number of Red Squares
Number of Blue Squares
Number of Triangles
Number of Red Triangles
Number of Blue Triangles
Total Red
Total Blue
33. Indentation per level
● If we're indexing by tuple, then the indentation level could correspond to the number of items in the tuple.
● Let's have general methods to format the indentation level, given the number of items in the tuple or an input 'level' integer.
34. A class write_indent() method
If it's part of a class with a counter and a msgs dict, just pass in the tuple:
def write_indent(self, tup_index):
    '''
    :param tup_index: tuple index into counter
    '''
    indent = ' ' * len(tup_index)
    msg = self.msgs[tup_index]
    total = self.counts[tup_index]  # named 'total' so it doesn't shadow the builtin sum()
    indented_msg = '{0:s}{1:s} {2:d}'.format(indent, msg, total)
36. Adjustable field widths
Depending on the data, we'll want different field widths:
red squares 5
Blue squares 21
Large Red Squares in the Bronx 987654321
37. Using format to format a format string
>>> f='{{0:{0:d}s}}'.format(3)
>>> f
'{0:3s}'
>>> f='{{0:{0:d}s}}{{1:{1:d}d}}'.format(3,5)
>>> f
'{0:3s}{1:5d}'
>>> f='{{0:s}}{{1:{0:d}s}}{{2:{1:d}d}}'.format(3,5)
>>> f
'{0:s}{1:3s}{2:5d}'
38. Format 3 values
● Our formatting string will print 3 values:
– String of space chars: {0:s}
– Message: {1:[msg_width]s}
– Sum: Right justified {2:-[sum_width]d}
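Putting slides 36-38 together, a sketch (variable names are mine) that derives the field widths from the data and then formats the format string itself:

```python
# Sample rows from slide 36: (message, count).
rows = [
    ('red squares', 5),
    ('Blue squares', 21),
    ('Large Red Squares in the Bronx', 987654321),
]

# Widths come from the widest message and the widest count.
msg_width = max(len(msg) for msg, _ in rows)
sum_width = max(len(str(n)) for _, n in rows)

# Doubled braces survive the outer .format() call (slide 37),
# leaving single-brace placeholders for the inner call.
fmt = '{{0:s}}{{1:{0:d}s}} {{2:{1:d}d}}'.format(msg_width, sum_width)

for msg, n in rows:
    print(fmt.format('', msg, n))  # first arg is the indent string
```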
43. SimpleCSVReporter
● Open a CSV File
● Create
– Set of possible values
– Set of possible tuples
– Counter indexed by each value & tuple
● Use IndentMessages to format output lines
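A minimal, purely illustrative sketch of such a class, assuming only the structure this slide describes (the talk's real code lives in the csv_utils2 repo linked at the end):

```python
import csv
import io
from collections import Counter

class SimpleCSVReporter(object):
    """Illustrative sketch, not the talk's actual class: collect each
    column's possible values, plus one Counter indexed both by
    individual values and by whole-row tuples."""

    def __init__(self, csvfile):
        reader = csv.DictReader(csvfile)
        self.fieldnames = reader.fieldnames
        self.possible_values = {h: set() for h in self.fieldnames}
        self.counts = Counter()
        for row in reader:
            # Count the full-row combination under a tuple key.
            self.counts[tuple(row[h] for h in self.fieldnames)] += 1
            for h in self.fieldnames:
                self.possible_values[h].add(row[h])
                self.counts.update([row[h]])  # list-wrapped: one item, not chars

    def report(self):
        lines = []
        for h in self.fieldnames:
            lines.append(h)
            for v in sorted(self.possible_values[h]):
                lines.append('  {0:8s}: {1:d}'.format(v, self.counts[v]))
        return '\n'.join(lines)

# Try it on the simpler CSV from slide 24, inlined for self-containment.
data = """color,size,shape
red,big,square
blue,big,triangle
green,small,square
blue,small,triangle
red,big,square
blue,small,triangle
"""
reporter = SimpleCSVReporter(io.StringIO(data))
print(reporter.report())
```

Note one design caveat: a single shared Counter means a value appearing in two different columns would share one count, which is why keying by tuple is the safer general scheme.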
59. Improvements
● Allow user-specified order for values, e.g.
FIRST, SECOND, THIRD
● Other means of tabulating
● Keeping track of blank values
● Summing counts in columns
● ...
61. Links
This talk: http://www.slideshare.net/pargery/mnh-csv-python
● https://github.com/pargery/csv_utils2
● Also some notes in http://margerytech.blogspot.com/
Info on Data Structures
● http://rhodesmill.org/brandon/slides/2014-04-pycon/data-structures/
● http://nedbatchelder.com/text/iter.html
DC crime stats
● http://data.octo.dc.gov/
“The data made available here has been modified for use from its original source, which is the Government of the District of Columbia. Neither the District of Columbia Government nor the Office of the Chief Technology Officer (OCTO) makes any claims as to the completeness, accuracy or content of any data contained in this application; makes any representation of any kind, including, but not limited to, warranty of the accuracy or fitness for a particular use; nor are any such warranties to be implied or inferred with respect to the information or data furnished herein. The data is subject to change as modifications and updates are complete. It is understood that the information contained in the web feed is being used at one's own risk.”