Cloud storage-based zero-client technology is attracting companies' interest because it enables secure and economical management of information resources. As personal smart devices such as smartphones and tablets see increasing business use, the limited screen size and computing capacity of most such devices make zero-client technology all the more necessary. From a usability standpoint, however, users report a serious problem with cloud storage-based zero-client systems: deciding on and locating a proper directory in which to store documents. This paper proposes a method to enhance the usability of a zero-client-based cloud storage system by intelligently and pervasively archiving working documents according to automatically identified topics. The proposed approach archives documents into predefined directories without requiring the user to specify a target directory, enabling more effective and efficient management of electronic documents.
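The paper's code is not given in the abstract; as a loose illustration of the auto-archiving idea, a document's terms can be matched against keyword profiles of predefined directories. The directory names, keyword sets, and function below are all hypothetical, not the paper's actual method:

```python
import re

# Hypothetical predefined directories, each described by a few topic keywords.
TOPIC_DIRS = {
    "finance": {"budget", "invoice", "revenue"},
    "hr": {"hiring", "payroll", "leave"},
    "engineering": {"deploy", "api", "bug"},
}

def auto_archive_path(text, default_dir="misc"):
    """Pick the predefined directory whose keyword set best matches the document."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {d: len(words & kws) for d, kws in TOPIC_DIRS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default_dir

print(auto_archive_path("Q3 budget and invoice summary"))  # finance
```

A real system would replace the keyword overlap with a trained topic classifier, but the interface (document in, directory out) is the point of the sketch.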
Learn how intelligent data capture has replaced scanning for archival. Understand how recognition technologies and capture software, including advanced OCR, barcodes, and regular expressions, combine to extract your important data seamlessly from scans and existing files. The time is now to truly turn your content into data.
Efficient Data Retrieval From Cloud Storage Using Data Minin... (IJCI Journal)
Cloud computing is an emerging technology that allows users to perform data processing and to use storage and data-access services from around the world through the internet. Cloud service providers charge according to usage. Imposing confidentiality and scalability on cloud data increases the complexity of cloud computing. Because sensitive information is centralized in the cloud, it must be encrypted before upload to preserve data privacy while still permitting efficient data utilization. As the data grows more complex and the number of users increases, file search must be possible through multiple keywords of interest to end users. Traditional searchable encryption schemes allow users to search encrypted cloud data through keywords, but they support only Boolean search, i.e., whether a keyword exists in a file or not, without any relevance ranking between data files and the queried keyword. Single-keyword ranked search over cloud data yields too coarse an output, and data privacy is compromised by server-side ranking based on order-preserving encryption (OPE).
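The Boolean-versus-ranked distinction the abstract draws can be illustrated, over plaintext and ignoring the encryption layer entirely, with a toy multi-keyword ranked search. All document names and data here are invented for illustration:

```python
from collections import Counter

def ranked_search(docs, query_keywords, top_k=2):
    """Rank documents by total frequency of the query keywords,
    instead of a Boolean exists/does-not-exist answer per file."""
    results = []
    for doc_id, text in docs.items():
        tf = Counter(text.lower().split())
        score = sum(tf[k] for k in query_keywords)
        if score:  # drop documents with no matching keyword at all
            results.append((doc_id, score))
    results.sort(key=lambda pair: -pair[1])
    return [doc_id for doc_id, _ in results[:top_k]]

docs = {
    "d1": "cloud storage security",
    "d2": "cloud cloud computing",
    "d3": "image processing",
}
print(ranked_search(docs, ["cloud", "storage"]))  # ['d1', 'd2']
```

The searchable-encryption schemes the abstract criticizes would return only a yes/no per file; the ranking step is what multi-keyword ranked search adds.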
Data Deduplication: Venti and Its Improvements (Umair Amjad)
Data is now available in digital form everywhere. To store this massive volume, the storage methodology should be efficient and intelligent enough to detect redundant data before saving it. Data deduplication techniques are widely used by storage servers to avoid storing multiple copies of the same data. Deduplication identifies duplicate data portions about to be written to a storage system, and also removes duplication in data already stored, yielding significant cost savings. This paper surveys data deduplication, discussing Venti in detail as the base case and identifying areas for improvement in Venti that are addressed by other papers.
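Venti's core idea, blocks addressed by a hash of their contents so that identical blocks are stored exactly once, can be sketched in a few lines. This is a simplified stand-in (tiny fixed-size blocks, an in-memory dict in place of the archival store), not Venti's actual implementation:

```python
import hashlib

class DedupStore:
    """Content-addressed block store in the spirit of Venti: each block is
    addressed by the SHA-1 fingerprint of its contents."""
    def __init__(self, block_size=4):
        self.block_size = block_size
        self.blocks = {}  # fingerprint -> block bytes

    def write(self, data):
        """Split data into blocks; duplicate blocks are stored only once."""
        fps = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            fp = hashlib.sha1(block).hexdigest()
            self.blocks.setdefault(fp, block)
            fps.append(fp)
        return fps  # the fingerprint list is enough to rebuild the data

    def read(self, fps):
        return b"".join(self.blocks[fp] for fp in fps)
```

Writing the same content twice costs no extra block storage, which is exactly the saving the abstract describes.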
"Analysis of Different Text Classification Algorithms: An Assessment" (ijtsrd)
Text classification has become a significant research area. It is the process of sorting documents into predefined categories based on their content: the automated assignment of natural-language texts to predefined categories. Text classification is an essential component of text retrieval systems, which retrieve texts in response to a user query, and of text-processing systems, which transform text in some way, for example by answering questions, producing summaries, or extracting data. In this paper we study various classification algorithms. Classification is the process of separating data into groups that can act either dependently or independently. Our main aim is to present a comparison of classification algorithms such as k-NN, Naive Bayes, Decision Tree, Random Forest, and Support Vector Machine (SVM) using Rapid Miner, and to find which algorithm is most suitable for users. Adarsh Raushan | Prof. Ankur Taneja | Prof. Naveen Jain "Analysis of Different Text Classification Algorithms: An Assessment" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-1, December 2019, URL: https://www.ijtsrd.com/papers/ijtsrd29869.pdf Paper URL: https://www.ijtsrd.com/computer-science/other/29869/analysis-of-different-text-classification-algorithms-an-assessment/adarsh-raushan
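One of the compared algorithms, multinomial Naive Bayes, is simple enough to sketch from scratch. This toy version (add-one smoothing, whitespace tokenization, invented training data) only illustrates the idea and is not the paper's Rapid Miner setup:

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Minimal multinomial Naive Bayes text classifier with add-one smoothing."""
    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # class -> word frequencies
        self.class_counts = Counter(labels)
        self.vocab = set()
        for text, y in zip(texts, labels):
            words = text.lower().split()
            self.word_counts[y].update(words)
            self.vocab.update(words)
        return self

    def predict(self, text):
        def log_score(y):
            total = sum(self.word_counts[y].values())
            score = math.log(self.class_counts[y])  # log prior (unnormalized)
            for w in text.lower().split():
                score += math.log((self.word_counts[y][w] + 1) /
                                  (total + len(self.vocab)))
            return score
        return max(self.class_counts, key=log_score)

clf = NaiveBayes().fit(
    ["cheap pills buy now", "meeting notes attached"],
    ["spam", "ham"],
)
print(clf.predict("buy cheap pills"))  # spam
```

k-NN, decision trees, and SVMs make different bias/variance trade-offs on the same bag-of-words features, which is what comparisons like this paper's measure.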
2016 BE Final Year Projects in Chennai - 1 Crore Projects (1crore projects)
IEEE PROJECTS 2016 - 2017
1 Crore Projects is a leading provider of guidance for IEEE projects and real-time project work. It has guided thousands of students and helped them benefit from technology training.
Project Domain list 2016
1. IEEE based on data mining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on image processing
5. IEEE based on multimedia
6. IEEE based on network security
7. IEEE based on parallel and distributed systems
ECE IEEE Projects 2016
1. Matlab project
2. Ns2 project
3. Embedded project
4. Robotics project
5. IOT Projects
Eligibility
Final Year students of
1. BSc (C.S)
2. BCA/B.E(C.S)
3. B.Tech IT
4. BE (C.S)
5. MSc (C.S)
6. MSc (IT)
7. MCA
8. MS (IT)
9. ME(ALL)
10. BE(ECE)(EEE)(E&I)
TECHNOLOGIES USED AND TRAINED IN
1. DOT NET
2. C sharp
3. ASP
4. VB
5. SQL SERVER
6. JAVA
7. J2EE
8. STRINGS
9. ORACLE
10. VB dotNET
11. EMBEDDED
12. MAT LAB
13. LAB VIEW
14. Multi Sim
CONTACT US:
1 CRORE PROJECTS
Door No: 66, Ground Floor,
No. 172, Raahat Plaza (Shopping Mall), Arcot Road, Vadapalani, Chennai,
Tamil Nadu, INDIA - 600 026
Email: 1croreprojects@gmail.com
Website: 1croreprojects.com
Phone: +91 97518 00789 / +91 7708150152
This presentation discusses the following topics:
Object-Oriented Databases
Object-Oriented Data Model (OODM)
Characteristics of object-oriented databases
Objects, attributes, and identity
Object-oriented methodologies
Benefits of object orientation in programming languages
Object-oriented model vs. Entity-Relationship model
Advantages of OODB over RDBMS
The huge volume of text documents available on the internet has made it difficult for users to find valuable information, so efficient applications for extracting knowledge of interest from textual documents are vitally important. This paper addresses the problem of responding to user queries by fetching the most relevant documents from a clustered set of documents. To that end, a cluster-based information retrieval framework is proposed for designing and developing a system that analyses and extracts useful patterns from text documents. In this approach, a pre-processing step first finds frequent and high-utility patterns in the data set; a Vector Space Model (VSM) is then used to represent the dataset. The system was implemented in two main phases: in phase 1, a clustering analysis process groups documents into several clusters, while in phase 2, an information retrieval process ranks clusters against the user query and retrieves the relevant documents from the clusters deemed relevant. The results are evaluated using Recall and Precision (P@5, P@10) of the retrieved results: P@5 was 0.660 and P@10 was 0.655.
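The evaluation metric the abstract reports, Precision@k, and the cosine similarity underlying a VSM ranking can both be written down directly. This is a generic sketch of the standard definitions, not the paper's code:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for d in retrieved[:k] if d in relevant) / k
```

With these definitions, the paper's P@5 of 0.660 means that, averaged over queries, about 3.3 of the top 5 returned documents were relevant.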
Analysis of Attack Techniques on Cloud-Based Data Deduplication Techniques (neirew J)
ABSTRACT
Data in the cloud is increasing rapidly, and this huge volume is stored in data centers around the world. Data deduplication allows lossless compression by removing duplicate data, so these data centers can utilize storage efficiently by eliminating redundancy. Attacks on cloud computing infrastructure are not new, but attacks that exploit the deduplication feature are relatively recent and increasingly pressing. Attacks on deduplication features in the cloud environment can happen in several ways and can give away sensitive information. Although deduplication enables efficient storage usage and bandwidth utilization, the feature has drawbacks. This paper closely examines data deduplication features and explains and analyzes how deduplication behaves as its various parameters change.
Object-Oriented Database Model For Effective Mining Of Advanced Engineering M... (cscpconf)
Materials have become a very important aspect of our daily life, and the search for better and new kinds of engineered materials has created opportunities for the information science and technology community to investigate the world of materials. This combination of materials science and information science is nowadays known as Materials Informatics. An object-oriented database model is proposed for organizing advanced engineering materials datasets.
Classifying Confidential Data Using SVM for Efficient Cloud Query Processing (TELKOMNIKA Journal)
Nowadays, organizations widely use cloud database engines from cloud service providers. Privacy remains the main concern: every organization wants a more secure environment for its own data. Several studies have proposed different encryption methods to protect data in the cloud, but the daily transactions, represented by queries against such databases, make blanket encryption an inefficient solution. Recent studies therefore present mechanisms for classifying data before migrating it to the cloud, reducing the need for encryption and improving efficiency. Yet most classification methods in the literature are based on string matching, which requires exact term matches and does not consider partial matches. This paper takes advantage of N-gram representation together with Support Vector Machine classification. Real-time data is used in the experiment, and after classification the Advanced Encryption Standard algorithm is used to encrypt the confidential data. Results show that the proposed method outperforms the baseline encryption method, emphasizing the usefulness of machine learning techniques for classifying data by confidentiality.
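The advantage of N-gram representation over exact string matching that the abstract describes is easy to see with character trigrams. This small sketch (Jaccard overlap of trigram sets) is only an illustration of why partial matches score, not the paper's SVM pipeline:

```python
def char_ngrams(term, n=3):
    """Character n-grams let related terms share features even when the
    strings differ, so partial matching becomes possible."""
    term = term.lower()
    return {term[i:i + n] for i in range(len(term) - n + 1)}

def ngram_overlap(a, b, n=3):
    """Jaccard similarity of the two terms' n-gram sets."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    return len(ga & gb) / len(ga | gb)
```

An exact string comparison of "confidential" and "confidentiality" fails outright, while their trigram sets overlap heavily; feeding such n-gram features to an SVM is the combination the paper proposes.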
Document prepared by Corporate Excellence – Centre for Reputation Leadership, citing, among other sources, the book Brand Premium, written by Nigel Hollis, Executive Vice President and Global Director of Millward Brown, and published by Palgrave in 2013.
Image Acquisition in an Underwater Vision System with NIR and VIS Illumination (csandit)
The paper describes an image acquisition system able to capture images in two separate bands of light, used for underwater autonomous navigation: the visible-light spectrum and the near-infrared spectrum. The characteristics of the natural underwater environment are also described, together with the process of underwater image creation, and the results of an experiment comparing selected images acquired in these channels are discussed.
Managing projects by functionality is inefficient because it assumes it is possible to list everything that must be done in a project before starting. If we get it wrong, we lose the project. But what if we do it anyway? In this talk, Lourenço Soares will discuss Epics (or the lack of them) and development based on small stories.
Figuring out why your program is broken can be hard. If you've added print statements and turned up the logging and still can't find the problem, you might reach for the command-line debugger that comes with Perl.
In this presentation, I'll show off an alternative. Devel::hdb is a graphical debugger for Perl that runs inside a web browser with a simpler interface. I'll also briefly cover how to write your own debugging tools with Devel::Chitin, and preview a new feature I've been working on.
Grant from the Diputación Foral de Bizkaia (Provincial Council of Bizkaia) for the launch and creation of companies. The call date is yet to be announced. Up to 25,000 euros per company.
2013 – Results and Plans of the Democratic Alliance (Vera Gruzova)
The Democratic Alliance calls for offensive action in 2014.
http://dem-alliance.org/news/demokratichnii-aljans-zaklikaje-do-nastupalnih-dii-v-2014-roci.html
Dynamic Resource Allocation and Data Security for Cloud (AM Publications)
Cloud computing is the next generation of IT organization. It moves software and databases to large centres where the management of services and data may not be fully trusted. This work focuses on cloud data storage security, an important aspect of quality of service. To ensure the correctness of users' data in the cloud, we propose an effective scheme using the Advanced Encryption Standard and the MD5 algorithm. Extensive security and performance analysis shows that the proposed scheme is highly efficient. In the proposed work we have also developed efficient parallel data processing in clouds and present our research project on parallel security, a data-processing framework that explicitly exploits dynamic storage together with data security. We propose a strong, formal model for data security in the cloud and for corruption detection.
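The corruption-detection side of such a scheme typically amounts to a digest stored at upload time and rechecked on retrieval. As a minimal sketch (using MD5 as the abstract does; the AES encryption layer is omitted, and the function names are mine):

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """MD5 digest stored alongside the data block when it is uploaded."""
    return hashlib.md5(data).hexdigest()

def is_corrupted(data: bytes, stored_digest: str) -> bool:
    """Recompute the digest on retrieval; any mismatch signals corruption."""
    return fingerprint(data) != stored_digest
```

Note that MD5 detects accidental corruption but is not collision-resistant against a malicious server; a scheme needing adversarial integrity guarantees would use a stronger primitive.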
Efficient and Scalable Multitenant Placement Approach for In-Memory Database ... (CSIT iaesprime)
Of late, the multitenant model with an in-memory database has become a prominent research area. This paper uses the advantages of multitenancy to reduce hardware and labor costs and to make storage available by sharing database memory and file execution. Its purpose is to give an overview of the proposed Supple architecture for implementing an in-memory database backend with multitenancy, applicable in public and private cloud settings. The backend in-memory database uses a column-oriented approach with a dictionary-based compression technique. We use a dedicated sample benchmark for workload processing and adopt an SLA penalty model. In particular, we present two approximation algorithms, multi-tenant placement (MTP) and best-fit greedy, to show the quality of tenant placement. The experimental results show that the MTP algorithm is scalable and efficient in comparison with the best-fit greedy algorithm over the proposed architecture.
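The best-fit greedy baseline the abstract mentions is a classic bin-packing heuristic. A generic sketch follows, under assumptions of mine (uniform server capacity, one-dimensional tenant sizes); the paper's MTP algorithm and SLA model are not reproduced here:

```python
def best_fit_greedy(tenants, server_capacity):
    """Place each tenant on the open server with the least remaining capacity
    that still fits it; open a new server when none fits."""
    servers = []     # remaining capacity of each open server
    placement = []   # server index chosen for each tenant, in order
    for size in tenants:
        candidates = [i for i, free in enumerate(servers) if free >= size]
        if candidates:
            i = min(candidates, key=lambda i: servers[i])  # tightest fit
            servers[i] -= size
        else:
            servers.append(server_capacity - size)
            i = len(servers) - 1
        placement.append(i)
    return placement, len(servers)

print(best_fit_greedy([5, 4, 3, 2], server_capacity=7))  # ([0, 1, 1, 0], 2)
```

Best-fit greedy is fast and simple, which is why it serves as the comparison point for the paper's placement algorithm.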
Cloud storage provides convenient, massive, and scalable storage at low cost, but data privacy is a major concern that prevents users from confidently storing files in the cloud.
A Systematic Review of In-Memory Database over Multi-Tenancy (IJECE IAES)
Significant cost and time are required to obtain a comprehensive response, and the response time of a query across a peer-to-peer database is one of the most challenging issues. This is particularly true for large-scale data processing, where the traditional approach of processing data on a single machine may not suffice; a scalable, reliable, and secure data-processing system is becoming increasingly important. Managing a single in-memory database instance for multiple tenants is often easier than managing separate databases for each tenant. This work focuses on scalability with multi-tenancy and on more efficient, faster query performance using an in-memory database approach. We compare the performance of a row-oriented approach and a column-oriented approach on our benchmark human resources (HR) schema using the Oracle TimesTen in-memory database. We also capture key advantages along several optimization dimensions, namely the traditional approach, late materialization, compression, and the invisible join, on column-store (C-Store) and row-based layouts. Enabling compression and late materialization in a query set improves the overall performance of the query set. In particular, the paper aims to elucidate the motivations behind multi-tenant application requirements concerning the database engine and to highlight major in-memory database designs for the tenancy approach in the cloud.
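Dictionary compression, one of the column-store optimizations compared above, can be sketched generically: each distinct value is stored once and rows hold small integer codes, so predicates can be evaluated on codes without materializing strings. This toy column is an illustration, not TimesTen's implementation:

```python
class DictColumn:
    """Column-oriented storage with dictionary compression."""
    def __init__(self, values):
        self.dictionary = sorted(set(values))            # each value once
        code = {v: i for i, v in enumerate(self.dictionary)}
        self.codes = [code[v] for v in values]           # one int per row

    def decode(self, row):
        """Late materialization: turn a code back into a value on demand."""
        return self.dictionary[self.codes[row]]

    def where_equals(self, value):
        """Evaluate an equality predicate on the integer codes only."""
        c = self.dictionary.index(value)
        return [i for i, code in enumerate(self.codes) if code == c]

col = DictColumn(["HR", "IT", "HR", "IT", "HR"])
print(col.where_equals("IT"))  # [1, 3]
```

The win is twofold: repeated values cost one dictionary entry plus small codes, and scans compare integers instead of strings, which is where the paper's compression and late-materialization gains come from.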
A Secure and Dynamic Multi-keyword Ranked Search Scheme over Encrypted Cloud ... (1crore projects)
IEEE PROJECTS 2015
1 Crore Projects is a leading provider of guidance for IEEE projects and real-time project work. It has guided thousands of students and helped them benefit from technology training.
Dot Net
DOTNET Project Domain list 2015
1. IEEE based on data mining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on image processing
5. IEEE based on multimedia
6. IEEE based on network security
7. IEEE based on parallel and distributed systems
Java Project Domain list 2015
1. IEEE based on data mining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on image processing
5. IEEE based on multimedia
6. IEEE based on network security
7. IEEE based on parallel and distributed systems
ECE IEEE Projects 2015
1. Matlab project
2. Ns2 project
3. Embedded project
4. Robotics project
Eligibility
Final Year students of
1. BSc (C.S)
2. BCA/B.E(C.S)
3. B.Tech IT
4. BE (C.S)
5. MSc (C.S)
6. MSc (IT)
7. MCA
8. MS (IT)
9. ME(ALL)
10. BE(ECE)(EEE)(E&I)
TECHNOLOGIES USED AND TRAINED IN
1. DOT NET
2. C sharp
3. ASP
4. VB
5. SQL SERVER
6. JAVA
7. J2EE
8. STRINGS
9. ORACLE
10. VB dotNET
11. EMBEDDED
12. MAT LAB
13. LAB VIEW
14. Multi Sim
CONTACT US
1 CRORE PROJECTS
Door No: 214/215, 2nd Floor,
No. 172, Raahat Plaza (Shopping Mall), Arcot Road, Vadapalani, Chennai,
Tamil Nadu, INDIA - 600 026
Email: 1croreprojects@gmail.com
Website: 1croreprojects.com
Phone: +91 97518 00789 / +91 72999 51536
Computer Science & Information Technology (CS & IT)

1. INTRODUCTION
The cloud storage, a model of networked corporate storage where data is stored in virtualized pools of storage, provides Internet-based data storage as a service. One of the biggest merits of cloud storage is that users can access data in a cloud anytime and anywhere, using any type of network-enabled user device [1]. Amazon Web Services S3 (http://aws.amazon.com/s3), Mosso (http://www.rackspacecloud.com), Wuala (http://www.wuala.com), Google Drive (http://drive.google.com), Dropbox (http://www.dropbox.com), uCloud (http://www.ucoud.com), and nDrive (http://ndrive.naver.com) are typical examples of corporate and personal cloud storage services. All of these services offer users transparent and simplified storage interfaces, hiding the details of the actual location and management of resources [2]. Once a document is stored in the cloud storage, a user can access and download it anytime and anywhere, provided that the designated access right has been granted. Because of these advantages in storing and retrieving information resources, more companies are implementing online storage in the cloud storage environment.
While cloud storage delivers various benefits to users, it also has technical limits in network security as well as in privacy [3]. In addition, from the viewpoint of usability, many users point out a very serious problem in using a cloud storage-based zero-client environment: the difficulty in storing and retrieving documents. Since the directories in the cloud storage have been defined and structured by the company's decision, most users are not accustomed to them. To store a working document in the cloud storage, a user has to decide on a proper directory that exactly coincides with the contents of the document. Since the directories are naturally varied and the overall structure of directories is complicated, deciding on a proper directory is not an easy task: sometimes a user can go astray through confusion in deciding and locating the target directory. Also, when a user tries to retrieve a document, he/she may spend considerable time locating the file because too many directories exist. Since the directories are not defined by the users themselves, it takes them relatively long to become accustomed to the structure. Therefore, automated assistance in determining the target directory, based on analysing the contents of the given document with respect to the directories in the cloud storage, is indispensable. Since keywords or topics extracted from the document indicate the possible title of the directory under which the document must be stored, a user can easily complete the job of storing and retrieving documents. In retrieving a document from the storage, more accurate and faster searching can be achieved because each document has been archived into a topic-based directory.
This research tries to enhance the usability of a zero-client-based cloud storage system by intelligently and pervasively archiving working documents according to automatically identified topics. To do so, this research suggests not only a framework to automatically extract the predefined directory-specific topic of a working document by applying an automatic document summarization technique, but also the sample code required to pervasively archive documents under the automatically determined directory.
2. FRAMEWORK FOR INTELLIGENT AND PERVASIVE ARCHIVING
Figure 1. Framework for intelligent and pervasive archiving
Figure 1 shows the suggested framework for intelligent and pervasive archiving. Since the cloud storage plays the role of a VDI-enabled database, pervasiveness in document storing and retrieving can be guaranteed. Intelligent archiving is attained by two functionalities: one is automatic document summarization, which automatically extracts a title for a given document, and the other is automatic directory search, which locates the given document in a predefined directory according to the extracted title.
Once a working document is created by users on their diskless ('empty-can') PCs or smart devices, it must be archived in the cloud storage-based corporate repository because the users' terminals are not equipped with internal storage. To archive the working document intelligently, the title of the document must be automatically determined by analysing the words it contains, and the corresponding directory in the cloud storage must also be automatically concluded according to the topic or title of the document. Therefore, as a user finishes creating the document and tries to save it, the module for automatic document summarization initiates its function to extract the title of the working document according to the following procedures:
2.1. File format converting
The file format of a working document varies with the type of software used to create it. To guarantee the efficiency of the analysis that extracts the title of a given document, the file format must be normalized (or standardized) into an analysable one before the remaining procedures are invoked [4]. In this research, file formats are converted into the '.txt' format to promote readability for the following modules.
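As a concrete illustration, the conversion step for one common source format can be sketched in Java as follows. The regex-based tag stripping and the class name are illustrative assumptions for this idea, not the converter actually used in the prototype.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class FormatConverter {
    // Normalize an HTML document into plain '.txt' by stripping markup.
    // A naive regex-based sketch; a production converter would use a real parser.
    public static String toPlainText(String html) {
        String noTags = html.replaceAll("(?s)<script.*?</script>", " ")
                            .replaceAll("(?s)<style.*?</style>", " ")
                            .replaceAll("<[^>]+>", " ");
        // Collapse runs of whitespace left behind by removed tags.
        return noTags.replaceAll("\\s+", " ").trim();
    }

    // Write the normalized text next to the original as a '.txt' file.
    public static void convert(Path source, Path target) throws IOException {
        String html = Files.readString(source, StandardCharsets.UTF_8);
        Files.writeString(target, toPlainText(html), StandardCharsets.UTF_8);
    }
}
```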
2.2. Stemming
Once the document formats have been normalized, the words in the document must also be normalized so that only the stem of each word is considered, by separating inflectional and derivational morphemes from the root, the basic form of a word. For example, the root of the English verb form 'preprocessing' is 'process-'; the stem is 'pre-process-', which includes the derivational affix 'pre-' but not the present progressive suffix '-ing'. After stemming each word, unnecessary stems must be eliminated to promote the efficiency of analysis, by applying lists of stop words to be filtered out ahead of further analysis.
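The stemming and stop-word filtering step can be sketched as follows. The suffix list and stop-word list are minimal illustrative assumptions; the prototype relies on Yale's word stemming tool, and a full Porter-style stemmer would be used in practice.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

public class Stemmer {
    // Illustrative stop-word list; real systems use much larger lists.
    private static final Set<String> STOP_WORDS = Set.of("the", "a", "an", "of", "in", "is", "and");

    // A crude suffix-stripping stemmer, standing in for a real Porter stemmer.
    public static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        for (String suffix : new String[]{"ing", "ed", "es", "s"}) {
            // Keep at least three characters of the root to avoid over-stripping.
            if (w.endsWith(suffix) && w.length() > suffix.length() + 2) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    // Tokenize, drop stop words, and stem the remaining tokens.
    public static List<String> normalize(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("\\W+"))
                .filter(t -> !t.isEmpty() && !STOP_WORDS.contains(t))
                .map(Stemmer::stem)
                .collect(Collectors.toList());
    }
}
```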
2.3. Vectorizing
Based on the word stems from the stemming phase, each stem must be vectorized to extract a document vector. In many cases TF/IDF (Term Frequency/Inverse Document Frequency) is used, and this study also applies it. TF/IDF is a statistical technique to assess the relative importance of a word in a document. A high TF/IDF weight is obtained by a high term frequency and a low document frequency of the term in the whole collection of documents; the weight therefore tends to filter out common terms. The word with the highest TF/IDF is deemed the topic of the document.
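A minimal sketch of the TF/IDF computation described above, assuming documents are already stemmed token lists. The natural-log IDF variant used here is one common choice, not necessarily the exact formula used in the prototype.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TfIdf {
    // Compute TF/IDF weights for one document against a small corpus.
    // tf = term count / document length; idf = ln(N / documents containing the term).
    // The document itself should be part of the corpus, so every df >= 1.
    public static Map<String, Double> weights(List<String> doc, List<List<String>> corpus) {
        Map<String, Double> tf = new HashMap<>();
        for (String term : doc) {
            tf.merge(term, 1.0 / doc.size(), Double::sum);
        }
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, Double> e : tf.entrySet()) {
            long df = corpus.stream().filter(d -> d.contains(e.getKey())).count();
            double idf = Math.log((double) corpus.size() / df);
            result.put(e.getKey(), e.getValue() * idf);
        }
        return result;
    }
}
```

A term appearing in every document gets weight zero, which is exactly the filtering of common terms described above; the remaining term with the highest weight is taken as the topic.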
2.4. Classifying
The resultant topic of a document can be identified by plotting the document vector onto a vector space prepared from predefined category-based sample data. Therefore, to promote the accuracy of classification, the quality of the sample data is crucial, and a corpus, a collection of predefined categories with a sufficient number of example documents, must be formally examined. In the conventional text mining area, a classifier is based on various algorithms such as SVM (Support Vector Machine), Naïve Bayes, and k-NN (k-Nearest Neighbors). In this research, an SVM-based classifier is implemented as an example because SVM was reported to outperform other algorithms [5, 6]. The accuracy of SVM-based classification was also verified to be satisfactory, reaching up to 90%, when the prediction model was sufficiently trained using a formal corpus like Reuters-21578 [7].
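To illustrate the idea of plotting a document vector into a category-based vector space, the following sketch uses a simple nearest-centroid classifier with cosine similarity. The prototype itself uses an SVM via LibSVM, so this is only a simplified stand-in for the same classification step.

```java
import java.util.Map;

public class CentroidClassifier {
    // Assign a document vector to the category whose centroid has the
    // highest cosine similarity. Centroids would be averaged TF/IDF
    // vectors of the sample documents in each predefined category.
    public static String classify(Map<String, Double> docVec,
                                  Map<String, Map<String, Double>> centroids) {
        String best = null;
        double bestSim = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Map<String, Double>> c : centroids.entrySet()) {
            double sim = cosine(docVec, c.getValue());
            if (sim > bestSim) { bestSim = sim; best = c.getKey(); }
        }
        return best;
    }

    // Cosine similarity over sparse term-weight maps.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, na = 0, nb = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
            na += e.getValue() * e.getValue();
        }
        for (double v : b.values()) nb += v * v;
        return (na == 0 || nb == 0) ? 0 : dot / (Math.sqrt(na) * Math.sqrt(nb));
    }
}
```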
The identified topic of a document must then be migrated to the directory searching module to conclude the directory under which the document is archived. The title of the document is formulated by combining the topic, the document creator's ID, and the time of archiving, so that the document can be uniquely identified.
3. TOPIC IDENTIFICATION BASED ON AUTOMATIC DOCUMENT SUMMARIZATION
Automatic summarization is the process of producing a reduced version containing the most important points of a given document by means of computer programs. Making summaries automatically is an indispensable task as the amount of information and documents increases. Summly, an iPhone-based automatic summarization application developed by Nick D'Aloisio (http://summly.com/) and acquired by Yahoo, is a typical example that proves the present importance of automatic summarization techniques. There exist two approaches to automatic summarization: extraction and abstraction. Extractive methods select a subset of existing words, phrases, or sentences in the original text to make a summary. Abstractive methods build an internal semantic representation and then use natural language generation techniques to make a summary that is closer to what a human might generate. Abstractive methods can paraphrase freely and therefore perform more comprehensive and realistic summarization. However, because of the burdens in implementing and training a prediction model for concluding keywords or keyphrases by projecting word vectors onto the n-dimensional corpus-based space, extractive methods are more widely used than abstractive methods.
Conventional extractive methods usually require training the prediction algorithm, or classifier, using predefined category-based sample data, and this type of learning procedure is called supervised learning. In supervised learning, each set of sample data is composed of a pair of a document (or a word) and its associated category in the form of a vector. By reading sample data, a prediction algorithm can form a vector space constructed from the given categories, and can therefore put the vector of a given document (or word) onto the corresponding location within the vector space. While supervised methods can produce reliable outputs based on pre-validated data, their application is limited by the large amount of training data required as well as by the quality of the data sets. Usually a large number of documents with identified keywords or keyphrases are required to train a classifier, so burdens in time and computing capacity inevitably arise. Moreover, wrong results can be produced when data with a biased subject is input. Therefore, to address these limitations, unsupervised methods such as TextRank [8] and LexRank [9], which eliminate the process of training on sample data, are gaining much interest. The TextRank algorithm exploits the structure of the text itself to determine keyphrases that appear 'central' to the text, in the same way that PageRank [10] selects important Web pages. Because TextRank enables the application of graph-based ranking algorithms to natural language texts, it produces results independent of the training data and language type.
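A simplified sketch of the TextRank idea: words are linked when they co-occur adjacently, and PageRank-style scores are iterated over the resulting undirected graph. The window size, damping factor, and iteration count here are common defaults assumed for illustration, not the parameters reported in the TextRank paper.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TextRankSketch {
    // Rank words of a tokenized text by a PageRank-style iteration over
    // a co-occurrence graph (adjacent tokens are linked).
    public static Map<String, Double> rank(List<String> tokens) {
        Map<String, Set<String>> graph = new HashMap<>();
        for (int i = 0; i < tokens.size() - 1; i++) {
            String a = tokens.get(i), b = tokens.get(i + 1);
            if (a.equals(b)) continue;
            graph.computeIfAbsent(a, k -> new HashSet<>()).add(b);
            graph.computeIfAbsent(b, k -> new HashSet<>()).add(a);
        }
        Map<String, Double> score = new HashMap<>();
        graph.keySet().forEach(w -> score.put(w, 1.0));
        double d = 0.85;  // damping factor, the usual PageRank default
        for (int it = 0; it < 30; it++) {
            Map<String, Double> next = new HashMap<>();
            for (String w : graph.keySet()) {
                double sum = 0;
                // Each neighbour distributes its score over its own degree.
                for (String n : graph.get(w)) {
                    sum += score.get(n) / graph.get(n).size();
                }
                next.put(w, (1 - d) + d * sum);
            }
            score = next;
        }
        return score;
    }
}
```

Because the scores depend only on the text's own co-occurrence structure, no training data is needed, which is exactly the appeal of the unsupervised methods discussed above.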
4. PROTOTYPE DESIGN
The prototype system is designed to initiate the topic extraction function as soon as the user tries to save the working document. Indexing the document by tagging the identified topic with the user's ID and the time, the prototype transmits and stores the document in the cloud storage. A dialogue between the user and the prototype is also needed to check whether the resultant topic is proper. If the user confirms that the topic has no problem, the prototype transmits the file to the cloud storage, tagging the required information about the user's ID and the time of archiving: automatic archiving is thus completed. Figure 2 shows the sequence of functions of the prototype.
The Stemming and Vectorizing modules are implemented using the 'Word stemming tool' and the 'Vector creating tool' of 'Yale', an open source environment for KDD (Knowledge Discovery and Data mining) and machine learning [11]. As announced previously, this research deploys the SVM algorithm as the classification method; therefore, a classifier is implemented using LibSVM [12]. The following code shows the procedures to convert the file format and to make the classifier read the file.
Figure 2. Execution sequence of the prototype system
if (predict_probability == 1) {
    if (svm_type == svm_parameter.EPSILON_SVR || svm_type == svm_parameter.NU_SVR) {
        System.out.print("Prob. model for test data: target value = predicted value + z,\n"
                + "z: Laplace distribution e^(-|z|/sigma)/(2sigma), sigma="
                + svm.svm_get_svr_probability(model) + "\n");
    } else {
        svm.svm_get_labels(model, labels);
        prob_estimates = new double[nr_class];
        output.writeBytes("labels");
        for (int j = 0; j < nr_class; j++)
            output.writeBytes(" " + labels[j]);
        output.writeBytes("\n");
    }
}
while (true) {
    String line = input.readLine();
    if (line == null) break;
    // Each line is in LibSVM format: "<label> <index>:<value> <index>:<value> ..."
    StringTokenizer st = new StringTokenizer(line, " \t\n\r\f:");
    double target = atof(st.nextToken());
    int m = st.countTokens() / 2;
    svm_node[] x = new svm_node[m];
    for (int j = 0; j < m; j++) {
        x[j] = new svm_node();
        x[j].index = atoi(st.nextToken());
        x[j].value = atof(st.nextToken());
    }
The identified topic needs to be combined with the creator's (user's) ID and the archiving time, as the following code shows:
SimpleDateFormat dateFormat = new SimpleDateFormat("yyyyMMdd", Locale.KOREA);
String recordedDate = dateFormat.format(new Date());
String tagName = topicName + "-" + userID + "-" + recordedDate;
System.out.printf("Indexing tag is %s%n", tagName);
JOptionPane.showMessageDialog(null, tagName);
Under the condition that the cloud storage has n directories (indexed i = 1, 2, …, n), the document must be archived in one of the existing directories. To search for the proper directory coinciding with the identified topic, a hash function can be used; the corresponding code can be as follows:
HashMap<String, Integer> categoryHash = new HashMap<String, Integer>();
categoryHash.put("Topic1", 0);
categoryHash.put("Topic2", 1);
categoryHash.put("Topic3", 2);
...
categoryHash.put("Topicn", n - 1);
int indexOfCategory = categoryHash.get(topicName);
System.out.printf("Searching result is index %d%n", indexOfCategory);
JOptionPane.showMessageDialog(null, indexOfCategory);
If the search result is correct and the user confirms it, a message that archiving is in process needs to be sent to the user, as the following code shows:
try {
BufferedWriter tagFile = new BufferedWriter(new FileWriter(filePath));
tagFile.write(tagName);
tagFile.close();
} catch (IOException e) {
System.err.println(e);
System.exit(1);
}
JOptionPane.showMessageDialog(null, "The document is to be saved as '" + filePath + "'");
Finally, the document is archived in the concluded directory with the title 'topic-ID-date', and a message informing the user of the completion of archiving is displayed with the title and location of the archive, as the following code shows:
String msg = "The topic of working document is '" + topicName + "' ?";
int ret = JOptionPane.showOptionDialog(null, msg, "Message Window",
        JOptionPane.YES_NO_OPTION, JOptionPane.PLAIN_MESSAGE, null, null, null);
switch (ret) {
    case JOptionPane.YES_OPTION:
        JOptionPane.showMessageDialog(null, "The file '" + tagName + "' has been archived.");
        break;
    case JOptionPane.NO_OPTION:
        JOptionPane.showMessageDialog(null, "user canceled");
        break;
}
5. CONCLUSIONS
Zero-client-based cloud storage is gaining much interest as a tool for centralized management of organizational documents. Besides the well-known defects of cloud storage, such as security and privacy protection, users of zero-client-based cloud storage point out the difficulty in browsing and selecting the storage directory because of its diversity and complexity. To resolve this problem, this study proposes a method of intelligent document archiving that applies an automatic summarization-based topic identification technique. Since the cloud storage plays the role of a VDI-enabled database, pervasive document storing and retrieving is naturally enabled. Although several studies have tried to enhance the functionality of corporate archiving systems, no research has suggested intelligent archiving that automatically attaches titles to documents to leverage the usability of zero-client-based cloud storage, which is the main contribution of this study.
Several issues remain to be discussed concerning the technical limitations and future work of this paper. In particular, the choice of algorithm for automatic document summarization deserves attention, because the efficiency of applying SVM is questionable given the burden of training the prediction model. Training the prediction model via server-side computing might resolve this problem; however, the computing load the server must endure would also keep increasing as the use of smart devices grows. Unsupervised approaches can therefore yield very effective solutions to this problem. However, more formal and statistical validation of the performance of unsupervised methods is required before they can earn the credibility that supervised methods have already established. Meanwhile, a formal corpus must be developed to guarantee the performance of conventional text mining techniques, because most conventional text mining algorithms depend heavily on the quality of the corpus. A more formal, general, and universal corpus must be developed so that the results of applying it are unbiased and objective. Since such a corpus can also be used to define the directories of the cloud storage, this supplement would further strengthen the applicability of the intelligent document archiving suggested by this study.
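To make the discussion of unsupervised alternatives concrete, the sketch below ranks sentences by graph centrality in the style of TextRank (Mihalcea & Tarau, 2004), using word overlap as the edge weight and power iteration with the usual damping factor d = 0.85. The toy sentences and parameter choices are illustrative assumptions, not this paper's configuration.

```python
import math

def similarity(s1, s2):
    """TextRank-style edge weight: shared words / (log|s1| + log|s2|)."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    if not w1 or not w2:
        return 0.0
    denom = math.log(len(w1)) + math.log(len(w2))
    return len(w1 & w2) / denom if denom > 0 else 0.0

def textrank(sentences, d=0.85, iters=50):
    """Score sentences by centrality in their similarity graph via power iteration."""
    n = len(sentences)
    sim = [[similarity(a, b) if i != j else 0.0
            for j, b in enumerate(sentences)]
           for i, a in enumerate(sentences)]
    scores = [1.0] * n
    for _ in range(iters):
        # Each sentence receives rank from neighbors, weighted by edge strength.
        scores = [
            (1 - d) + d * sum(
                sim[j][i] / sum(sim[j]) * scores[j]
                for j in range(n) if sim[j][i] > 0
            )
            for i in range(n)
        ]
    return scores

docs = [
    "cloud storage keeps organizational documents in central directories",
    "users archive working documents into cloud storage directories",
    "the weather was pleasant today",
]
scores = textrank(docs)
# Sentences 0 and 1 share vocabulary, so both outrank the unrelated sentence 2.
```

Because no labeled training data is needed, such a method avoids the server-side SVM training burden discussed above; the open validation question is whether its extracted topic sentences match supervised accuracy.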
ACKNOWLEDGEMENTS
This work was supported by the National Research Foundation of Korea Grant funded by the
Korean Government (NRF-2013S1A5A2A01017530).
REFERENCES
[1] Liu, Q., Wang, G., & Wu, J., (2012) “Secure and privacy preserving keyword searching for cloud
storage services”, Journal of Network and Computer Applications, Vol.35, No.3, 927-933.
[2] Pamies-Juarez, L., García-López, P., Sánchez-Artigas, M., & Herrera, B., (2011) “Towards the design
of optimal data redundancy schemes for heterogeneous cloud storage infrastructures”, Computer
Networks, Vol.55, 1100-1113.
[3] Svantesson, D. & Clarke, R., (2010) “Privacy and consumer risks in cloud computing”, Computer
Law & Security Review, Vol.26, 391-397.
[4] Kim, S., Suh, E., & Yoo, K., (2007) “A study of context inference for Web-based information
systems”, Electronic Commerce Research and Applications, Vol.6, 146-158.
[5] Basu, A., Watters, C., & Shepherd, M., (2003) “Support Vector Machines for Text Categorization”,
Proceedings of the 36th Hawaii International Conference on System Sciences, Vol.4.
[6] Meyer, D., Leisch, F., & Hornik, K., (2003) “The support vector machine under test”,
Neurocomputing, Vol.55, 169-186.
[7] Hsu, C.W., Chang, C.C., & Lin, C.J., (2001) “A Practical Guide to Support Vector Classification:
LibSVM Tutorial”. In http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf.
[8] Mihalcea, R. & Tarau, P., (2004) “TextRank: Bringing order into texts”, Proceedings of EMNLP,
Vol.4, No.4.
[9] Erkan, G. & Radev, D.R., (2004) “LexRank: Graph-based lexical centrality as salience in text
summarization”, Journal of Artificial Intelligence Research, Vol.22, No.1, 457-479.
[10] Brin, S. & Page, L., (1998) “The anatomy of a large-scale hypertextual Web search engine”,
Computer Networks and ISDN Systems, Vol.30, No.1-7, 107-117.
[11] Mierswa, I., Wurst, M., Klinkenberg, R., Scholz, M., & Euler, T., (2006) “YALE: Rapid Prototyping
for Complex Data Mining Tasks”, Proceedings of the 12th ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining (KDD-06).
[12] Chang, C. & Lin, C., (2011) “LIBSVM: a library for support vector machines”, ACM Transactions on
Intelligent Systems and Technology, Vol.2, No.3, 1-27.
AUTHOR
Keedong Yoo is an associate professor in the Department of MIS at Dankook University,
South Korea (kdyoo@dankook.ac.kr). He holds a B.S. and an M.S. in Industrial Engineering
from POSTECH (Pohang University of Science and Technology), South Korea, and a
Ph.D. in Management and Industrial Engineering from POSTECH. His research
interests include knowledge management and service, intelligent and autonomous
systems, and context-aware and pervasive computing-based knowledge systems.