This document describes a final project for the BENG 108 course: a program that identifies Corydoras catfish species. Users enter observed physical traits and the program returns potential species matches. Administrators can also add new species to the database. The project involved downloading species photos, defining distinguishing traits, building a searchable database in LabVIEW, and ensuring the program could identify species from trait combinations. Bugs that were found and fixed included the program skipping the first trait and problems reusing LED buttons.
1. BENG 108 Final Project
Corydoras Species Identifier Program
Jason Trimble & Max Peterson
2. Core Principles and Goals
Create a program that can identify individual species of the genus Corydoras from a set of observed physical traits entered by the user
Make this program easy to use
Create a second program for admins to add new species to the database
Optimize both for use online at www.planetcatfish.com
3. Key Preliminary Problems
Downloading individual photos of each species
Deciding which physical characteristics would be most useful for species identification
Defining the characteristics that each individual species possesses
Ensuring each characteristic set has enough variation to narrow the species possibilities for a set of observations down to at most 5
Building a LabVIEW-accessible database of all the above information
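The "at most 5" goal above can be checked mechanically once the database exists. The following is a minimal Python sketch rather than anything from the LabVIEW project itself: the function name `oversized_groups`, the row layout (species name first, trait numbers after), and the placeholder species names are all assumptions made for illustration.

```python
from collections import defaultdict


def oversized_groups(rows, limit=5):
    """Return trait combinations shared by more than `limit` species.

    rows: list of lists of strings, species name in column 0 (assumed layout).
    """
    groups = defaultdict(list)
    for row in rows:
        # Group species names by their full combination of trait values.
        groups[tuple(row[1:])].append(row[0])
    # Keep only the combinations that fail to narrow the result enough.
    return {traits: names for traits, names in groups.items() if len(names) > limit}


# Hypothetical data: six placeholder species share one trait combination.
rows = [[f"Species {i}", "0", "1"] for i in range(6)] + [["Species X", "1", "1"]]
print(oversized_groups(rows))
```

An empty result would mean every possible set of observations narrows to 5 species or fewer.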
4. Solutions
Using the inspect tool to download photos, circumventing copyright law
Collaborating with the other groups to create a standard for physical characteristics such as nose type, dorsal fin pattern, and overall body pattern
Using a Google Docs Excel file so that all group members could help build the database
Integrating this Excel file into our LabVIEW program
6. Materials
LabVIEW
Property and Invoke Nodes
Local variables
Selector functions
Case structures
Frame structure
String indicators / LEDs / next function
Number to decimal string function
Initialize / index / build array functions
Array max and min function
8. Our Program – User Interface
As mentioned in the program requirements, there need to be options for both finding a fish species from traits and adding a new fish species to the data file.
This was done by using string indicators to display these two options and having the user select an LED.
If the user selects the add new species option, then a name for the species must be typed into a string control.
The user must confirm his/her choice by clicking a next button.
The trait options are displayed in a similar fashion, with string indicators displaying the trait and its options. The user must click the corresponding LED.
Another next button is used to cycle between traits.
9. Our Program – User Interface (cont.)
There are 8 indicators and LEDs for up to 8 trait options, but the higher-numbered ones are generally not needed and are hidden from the user for traits with only a few options.
The LED buttons and next buttons must be reinitialized in order for them to remain unlit after switching traits (this prevents user confusion and decreases the chances of selecting multiple options).
For the finding a fish option, the matching fish species (one or a few) with the corresponding trait options are displayed by name.
There is a reminder to check the database text file if the add fish species option was selected.
A welcome message is displayed to the user when starting the program.
11. Our Program – Inner Workings
A 1D array is created for storing the trait choices that the user makes.
A frame structure is used to sequence the events in the program.
For each LED button, there is a selector function which passes forward the option number (array style: 0, 1, …) if the LED was pressed (true). If the LED was not pressed (false), then a -1 is passed forward.
The selector function outputs are combined and the maximum value is chosen, since the negative values mean false (build array + max/min). If more than one button is pressed, then the later option is chosen.
A case structure is used to initialize the add a new species option if that was chosen. A true constant is passed forward for adding to the end of the database file. The new fish species name is also passed forward.
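The selector-and-maximum logic described above can be written out in text form. This is a minimal Python sketch of the idea, not the LabVIEW block diagram; the function name `chosen_option` and the list-of-booleans input are assumptions made for the example.

```python
def chosen_option(led_states):
    """Return the index of the selected option, or -1 if no LED is lit.

    led_states: list of booleans, one per LED (hypothetical input format).
    """
    # Selector step: a pressed LED passes its index forward, an unpressed one passes -1.
    candidates = [index if pressed else -1 for index, pressed in enumerate(led_states)]
    # Build array + max step: -1 loses to any real index,
    # so when several LEDs are lit the later option wins.
    return max(candidates, default=-1)


print(chosen_option([False, True, False]))  # 1
print(chosen_option([True, False, True]))   # 2 (later option wins)
print(chosen_option([False] * 8))           # -1 (nothing selected)
```

A result of -1 is the "false" case, so a caller would only store the choice when the returned value is 0 or greater.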
12. Our Program – Inner Workings (cont.)
Each section of the program is enclosed in a while loop, which prevents the user from moving forward until the next button is clicked.
The LEDs, string indicators, and next buttons are reinitialized between frames of the code (using invoke nodes).
The string indicators and controls are set to be visible or invisible for each section of code (using property nodes).
The trait option value must be greater than or equal to 0 (meaning true) in order for the number choice to be passed forward (true condition).
For each additional trait, the array of choices is passed forward, a new choice number is added to the array, local variables are used to reuse the same LED, string indicator, or next button, and the visibility of the trait options varies.
15. Our Program – Inner Workings (Finding Matching Species)
The choice array is converted into strings since these are easier to compare (using the number to decimal string function).
The database file is read in and converted into an array (using the read from text file and spreadsheet string to array functions).
Beginning with the first choice and the second column of the database file (the first column holds the fish species names and is skipped), a for loop cycles through the second column of the data file to find rows whose value equals the first choice. If the values are equal, the entire row of the file is added to an array. The initialize, index, and build array functions are used.
The new array and the array of choices are passed forward to the next comparison segment of the code, where a new array is made by comparing the next values (such as the 2nd choice and the 3rd column). This new array is a subset of the previous new array.
All of the choices are compared until only one or a few matches remain.
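The column-by-column narrowing described above can be sketched in Python. This is an illustration under stated assumptions, not the LabVIEW implementation: the function names `parse_database` and `find_matches`, the tab delimiter, and the placeholder species names and trait values are all hypothetical.

```python
def parse_database(text, delimiter="\t"):
    """Split delimited text into rows of strings (the tab delimiter is an assumption)."""
    return [line.split(delimiter) for line in text.splitlines() if line.strip()]


def find_matches(rows, choices):
    """Narrow the database rows down to the species matching every trait choice.

    rows: list of lists of strings, species name in column 0.
    choices: list of option numbers, one per trait, in column order.
    """
    # Convert the numeric choices to strings so they compare directly with the file text.
    choice_strings = [str(choice) for choice in choices]
    matches = rows
    for trait_number, wanted in enumerate(choice_strings):
        column = trait_number + 1  # skip column 0, which holds the species name
        # Each pass keeps a subset of the rows that survived the previous pass.
        matches = [row for row in matches if len(row) > column and row[column] == wanted]
    return [row[0] for row in matches]


# Hypothetical database contents: a name followed by three trait numbers.
text = "Species A\t0\t1\t2\nSpecies B\t0\t2\t2\nSpecies C\t1\t1\t2\n"
rows = parse_database(text)
print(find_matches(rows, [0, 1, 2]))  # ['Species A']
```

Passing only the first few choices returns every species still consistent with them, which mirrors how each comparison segment works on the subset left by the previous one.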
16. Our Program – Inner Workings (Finding Matching Species)
17. Our Program – Inner Workings (Adding a New Fish Species)
A case structure at the end of the code only executes the true condition if the user chose earlier to add a new fish species to the end of the database file.
The array of choices and the new species name are passed forward, with the name inserted in front of the values in the 1D array (using the insert into array function).
The 1D array for the new fish species, with its name and associated trait numbers, is written to the original database text file as the last row (using the write to delimited spreadsheet function).
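The append step above can be sketched as follows. This is a minimal Python illustration, assuming a tab-delimited text file; the function name `append_species` and its parameters are hypothetical and stand in for the LabVIEW insert into array and write to delimited spreadsheet functions.

```python
import csv


def append_species(path, name, choices):
    """Append one species row (name followed by trait numbers) to the database file."""
    # Insert the name in front of the trait values, as in the 1D array described above.
    row = [name] + [str(choice) for choice in choices]
    # Open in append mode so the existing rows are left untouched.
    with open(path, "a", newline="") as handle:
        csv.writer(handle, delimiter="\t", lineterminator="\n").writerow(row)
    return row
```

Opening the file in append mode, rather than rewriting it, keeps the original entries intact and places the new species in the last row.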
18. Our Program – Inner Workings (Adding a New Fish Species)
19. Errors and Bug Fixes
Errors
Program kept skipping over first trait.
LEDs stayed lit through the entire program.
Putting the frame structure within a case structure (hard to copy).
Can’t reuse LEDs by copying.
Fixes
A local variable was needed to turn off the next button from previous runs.
Reinitialized LEDs after use.
Putting frames at the beginning and end for adding a fish worked better.
Used local variable for reusing LEDs.
21. Errors and Bug Fixes (cont.)
Errors
Array of choices was all one number.
Comparing numbers and text.
Found species were repeated many times.
The text file ended up containing only new entries and skipped lines.
Fixes
Got rid of shift registers when building.
Converted the choice value array into strings.
Did not also write to the original text file when adding a new species.
Changed wiring.
24. Conclusion
Both features were functional: finding a species based on traits and adding a new species of catfish to the end of the database file.
25. If it was Perfect…
The program would continuously calculate possible species, even if only one characteristic has been chosen.
Picture attachments, as well as a link to the species web page, would appear with the species name in the program.
The database would be perfect, with no two species fitting the same set of characteristics.
The user would not be able to select multiple options or select nothing (more error handling and a more user-friendly design).
Reentry of a new species name would not count.