Semantic Pattern Recognition Based on Linear Algebra and Latent Semantic Analysis

Pattern recognition is the process of identifying a vector of correlated/uncorrelated attributes and discriminating it from other patterns. Pattern recognition is closely related to machine learning, data mining and Knowledge Discovery in Databases (KDD). In this research work we investigate decomposing the pattern (i.e., attribute-vector) space into subspaces in which patterns cluster around the bases of the subspaces. This paper introduces the thesis that, given a space of vectors with a basis, Singular Value Decomposition (SVD) performs excellently in discovering that basis; hence, in pattern recognition a space can be decomposed into sub-spaces so that patterns cluster around their bases. Results are collected and discussed, and they show that SVD and its extension Latent Semantic Analysis (LSA) can optimize the machine-learning process and exhibit a strong tendency to converge toward cognition-based recognition.


Introduction
Pattern recognition is generally defined as the assignment of a label to a given fixed-size input stream of attributes [1]; for example, classification is an outcome of pattern recognition. Pattern recognition has its origin in engineering, while machine learning has its origin in computer science; in any case, a huge number of applications exploit the methodologies of pattern recognition to accomplish their tasks.
Conventionally, a pattern is an N-dimensional column vector called a feature vector.
Feature and attribute are used interchangeably in the literature. This vector describes a state of a system, and a collection of these vectors represents the system space [1,2]. Studies [1,2,3,4,5] have investigated the characteristics of the components composing the vector space, for example the linear dependency among components used to label classes [4,5]. In the approach presented in this paper we analyze the problem space in the semantic space in order to reveal hidden relationships among the features (i.e., attributes) of the vector space.
LSA is an analysis tool used to analyze the relation between structures of concepts and the documents containing these concepts [1,5,7]. LSA is not restricted to natural language; it can be used with any collection of documents composed of structures (i.e., items). LSA uses partial information of the image to build methods that achieve global implications.
The features of the local information, reflected in the text as the meaning of a particular string, are still called semantic. The semantic features are not expressed in an explicit, intuitive form but are implicit in the data. The purpose of using LSA in image analysis is to extract high-level semantic concepts from the visual features of the image, in the hope of bridging the gap between low-level content features and high-level semantic features.

1-Latent Semantic Analysis (LSA)
LSA is a theory and method for extracting and representing the meaning of words. Meaning is estimated using statistical computations applied to a large corpus of text [4]. The corpus embodies a set of mutual constraints that largely determine the semantic similarity of words and sets of words. These constraints can be solved using linear-algebra methods, in particular singular value decomposition [4,5].
LSA has been shown to reflect human knowledge in a variety of ways. For example, LSA measures correlate highly with humans' scores on standard vocabulary and subject-matter tests. The core processing in LSA is to decompose the term-document matrix A using SVD; SVD is designed to reduce a dataset containing a large number of values to a dataset containing significantly fewer values, while still retaining a large fraction of the variability present in the original data [3,4,5]. In LSA the data is subjected to a two-part transformation: 1- The word frequency (+1) in each cell is converted to its log.
2- The information-theoretic measure, entropy, of each word is computed as -Σ p log p over all entries in its row, and each cell entry is then divided by the row entropy value.
This two-part transformation is crucial to building the semantic space of the system modeled by the matrix, where each word or feature is weighted as an estimate of its importance in the passage [4].
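The two-part transformation above can be sketched in NumPy on a small hypothetical 4-word by 3-passage frequency matrix. The entropy weight is applied here in the common multiplicative log-entropy form g = 1 + Σ p log p / log n, a standard variant of the row-entropy division described above; the matrix values are illustrative only.

```python
import numpy as np

# Hypothetical 4-word x 3-passage frequency matrix (rows: words, columns: passages).
A = np.array([[2.0, 0.0, 1.0],
              [1.0, 3.0, 0.0],
              [0.0, 1.0, 1.0],
              [4.0, 0.0, 2.0]])

# Part 1: convert each cell's word frequency (+1) to its log.
L = np.log(A + 1.0)

# Part 2: entropy weight of each word computed over its row; a word spread
# evenly across passages (high entropy) gets a weight near 0, a word
# concentrated in one passage gets a weight near 1.
gf = A.sum(axis=1, keepdims=True)                      # global word frequency
p = np.divide(A, gf, out=np.zeros_like(A), where=gf > 0)
with np.errstate(divide="ignore", invalid="ignore"):
    plogp = np.where(p > 0, p * np.log(p), 0.0)
g = 1.0 + plogp.sum(axis=1, keepdims=True) / np.log(A.shape[1])

W = L * g    # weighted matrix that is then decomposed with SVD
```

The weighted matrix W, not the raw frequency matrix A, is what gets passed to the SVD step.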
This approach to representing patterns results in better analysis and detail, as becomes clear when these patterns are subjected to histogram analysis.
The hypothesis investigated in this paper is that 'the semantic attribute vector of the original pattern is a basis vector in a subspace', which means that all other vectors are spanned by the original pattern. Figure (1) shows the representation of the patterns in the geometrical space; vectors can be recognized by interpreting the results of the inner product. The analysis of the semantic space constructed by the above matrix is shown in Figure (4).
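Recognition by inner product can be sketched as follows: a candidate vector that is spanned by (i.e., a scalar multiple of) the stored basis pattern has a normalized inner product of 1 with it, while an unrelated vector scores near 0. The vectors below are hypothetical, chosen only to illustrate the test.

```python
import numpy as np

def cosine(u, v):
    """Normalized inner product; 1.0 means the vectors point the same way."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical attribute vectors: a stored basis pattern and two candidates.
basis = np.array([1.0, 2.0, 0.5])
candidate_a = 3.0 * basis                   # spanned by the basis pattern
candidate_b = np.array([-2.0, 1.0, 0.0])    # an unrelated direction

print(cosine(basis, candidate_a))   # ~1.0: recognized as the same pattern
print(cosine(basis, candidate_b))   # ~0.0: a different pattern
```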

Conclusions
From the results we conclude: 1. Applying SVD reduced the computational load by cancelling the less effective components of a pattern and considering only the components that encapsulate the system dynamics.

2. Patterns hold semantic relationships among their components, through which the patterns can be recognized efficiently; the LSA technique captures these semantic relationships and deploys them to recognize patterns within the semantic domain.
Vol: 13 No: 1, January 2017, DOI: http://dx.doi.org/10.24237/djps.1301.60A, P-ISSN: 2222-8373, E-ISSN: 2518-9255
Studies have examined the linear dependency among a vector's components and the correlation in sub-spaces of the problem space. In this paper we investigate the decomposition of the problem space into sub-spaces, or clusters, based on the outcome of SVD and LSA analysis. The problem domain for this research is the recognition of hand-written scripts, which are represented by vectors of attributes (i.e., statistical calculations that characterize each vector space). In [4] a description of pattern classifiers is presented in which classifiers are usually based on heuristic feature extraction in order to grant a simple classifier the ability to perform with high accuracy. The features frequently used in character recognition include the chain-code feature, K-L expansion, the Gabor transform and many others. The more discriminative the features extracted from the image, the more accurate the performance of the classifier. Clustering techniques are also used to increase the performance of pattern recognition due to the reduction of the search space: the problem space is categorized into classes such that each individual belonging to a certain class is close to the other vectors in that class; this can be interpreted in different schemes based on the methodology used.

The decomposition produced by SVD is A = U Σ Vᵀ, where:
1- (A Aᵀ) → (U): the eigenvectors of A Aᵀ form the columns of U,
2- (Aᵀ A) → (V): the eigenvectors of Aᵀ A form the columns of V,
3- (σᵢ)(σᵢ) → (λᵢ): the squared singular values are the eigenvalues.
The first structure is the single pattern that represents the most variance in the data; after all, SVD is an orthogonal analysis of the dataset. U is composed of the eigenvectors of the variance-covariance matrix of the data, where the first eigenvector points in the direction that holds the most variability produced by all other vectors jointly. U is an orthogonal matrix, so all its structures are mutually uncorrelated. Eigenvalues represent the scalar variance of the corresponding eigenvectors; in this way the total variation exhibited by the data is the sum of all eigenvalues, and the singular values are the square roots of the eigenvalues [4,6,7,8].
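These properties can be checked numerically. The sketch below runs SVD on a small hypothetical 4×3 attribute matrix and verifies the orthogonality of U and V, the square-root relation between singular values and the eigenvalues of AᵀA, and the dominance of the first structure.

```python
import numpy as np

# Hypothetical 4x3 data matrix (columns are attribute vectors of patterns).
A = np.array([[4.0, 2.0, 0.0],
              [2.0, 3.0, 1.0],
              [0.0, 1.0, 5.0],
              [1.0, 0.0, 2.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# 1- and 2-: U and V are orthogonal, i.e. U^T U = I and V^T V = I.
assert np.allclose(U.T @ U, np.eye(3))
assert np.allclose(Vt @ Vt.T, np.eye(3))

# 3-: singular values are the square roots of the eigenvalues of A^T A.
eigvals = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]   # descending order
assert np.allclose(s, np.sqrt(eigvals))

# Total variation is the sum of the eigenvalues (squared singular values);
# the first structure alone captures the largest share of it.
total_var = (s ** 2).sum()
print(s[0] ** 2 / total_var)    # fraction of variance in the first structure
```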
In [6] a detailed description of LTP (Local Ternary Patterns) is presented; the researchers agree on using LTP as a highly discriminative feature for texture classification, with high resistance to lighting effects. Patterns are expressed here in a three-valued fashion, where each pixel within the original pattern is represented by the following function:
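Since the function itself is not reproduced here, the sketch below uses the standard LTP thresholding rule as described in the literature on [6]: a neighbor maps to +1 if it exceeds the center pixel by at least a tolerance t, to -1 if it falls below it by at least t, and to 0 otherwise. The 3×3 patch and tolerance are hypothetical.

```python
def ltp_code(neighbor, center, t):
    """Standard LTP three-valued threshold (a sketch; the paper's exact
    formulation is not reproduced here). t is the user-chosen tolerance."""
    if neighbor >= center + t:
        return 1
    if neighbor <= center - t:
        return -1
    return 0

# Hypothetical 3x3 grayscale patch: code the 8 neighbors against the center.
patch = [[54, 60, 58],
         [51, 57, 70],
         [49, 57, 45]]
center = patch[1][1]
neighbors = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
             patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
codes = [ltp_code(n, center, t=5) for n in neighbors]
print(codes)    # [0, 0, 0, 1, -1, 0, -1, -1]
```

The zero band around the center value is what gives LTP its resistance to lighting effects: small intensity fluctuations within ±t leave the code unchanged.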

Figure 1: Geometrical Representation for the Proposed Pattern Recognition

Figure 4: Analysis of Attribute Vector Space Using SVD and LSA

Table 1 : Statistical Calculations for Each Hand Script Images
Table 1 presents the statistical calculations for the hand-script images of each pattern.