I. What is manifold learning?

Dimensionality Reduction

The accuracy of training algorithms generally improves with the amount of data we have. However, managing a large number of features is usually a burden on our algorithm. Some of these features may be irrelevant, so it is important to make sure the final model is not affected by them.

What is Dimensionality?

Dimensionality refers to the minimum number of coordinates needed to specify any point within a space or an object.

Why do we need Dimensionality Reduction?

If you keep the dimensionality high, the representation stays rich and unique, but it may not be easy to analyse because of the complexity involved. Apart from simplifying data, visualization is also an interesting and challenging application.

Linear Dimensionality Reduction

Fig 1 : PCA illustration (source: http://evelinag.com/blog )

Principal Component Analysis

Given a data set, PCA finds the directions along which the d...
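The post's own code is not shown here, so as a rough illustration of the idea, below is a minimal PCA sketch in NumPy via eigendecomposition of the covariance matrix. The `pca` helper, the toy data, and the choice of two components are assumptions for illustration, not part of the original post.

```python
# Minimal PCA sketch (illustrative only): project data onto the directions
# of largest variance found from the covariance matrix.
import numpy as np

def pca(X, k):
    """Project X (n_samples x n_features) onto its top-k principal directions."""
    # Center the data so the covariance is computed around the mean.
    X_centered = X - X.mean(axis=0)
    # Covariance matrix of the features.
    cov = np.cov(X_centered, rowvar=False)
    # eigh is used because the covariance matrix is symmetric.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Sort eigenvectors by decreasing eigenvalue (variance explained).
    order = np.argsort(eigvals)[::-1]
    components = eigvecs[:, order[:k]]
    # Project the centered data onto the top-k directions.
    return X_centered @ components

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))   # toy data: 100 samples, 5 features
    X_reduced = pca(X, k=2)         # reduce to 2 dimensions
    print(X_reduced.shape)          # (100, 2)
```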
Problem: Write a program to find all pairs of integers whose sum is equal to a given number. For example, if the input integer array is {1, 2, 3, 4, 5} and the given sum is 5, the code should return two pairs, {1, 4} and {2, 3}.

This is a common interview question about arrays. A quick search on Google returns several pages discussing the problem, to list but a few [1, 2, 3, 4]. Those pages discuss the solutions only at the conceptual level; although they also provide the code, they do not back the comparison up with measured numbers. In this note I re-implement all these approaches and compare their efficiency in practice.

Common approaches

There are four approaches which are often mentioned together: brute force, HashSet, sorting with search, and sorting with binary search. Each has its own pros and cons; interested readers can find the details in the list of sources below. Here is a summary of the storage complexity and computational complexity, followed by a sketch of the HashSet idea.

Approach Storage co...
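To make the HashSet approach concrete, here is a minimal sketch in Python, where the built-in `set` plays the role of the HashSet; the `find_pairs` name and single-pass complement lookup are my illustration, not the note's benchmarked implementations.

```python
# Hash-set approach (sketch): one pass over the array, O(1) average lookups.
def find_pairs(nums, target):
    """Return all pairs (a, b) with a + b == target."""
    seen = set()   # values encountered so far
    pairs = []
    for x in nums:
        complement = target - x
        if complement in seen:          # O(1) average membership test
            pairs.append((complement, x))
        seen.add(x)
    return pairs

print(find_pairs([1, 2, 3, 4, 5], 5))   # [(2, 3), (1, 4)]
```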
1. What is feature scaling?

Feature scaling is a method used to normalize the independent variables of our data. It is also called data normalization and is generally performed before running machine learning algorithms.

2. Why do we need to use feature scaling?

In practice the range of raw data can be very wide, and without normalization the objective functions may not work properly (for example, they may get stuck at a local optimum) or may be time-consuming to optimize. K-Means, for instance, can give you totally different solutions depending on the preprocessing methods that you used. This is because an affine transformation implies a change in the metric space: the Euclidean distance between two samples is different after that transformation. Feature scaling also helps gradient descent converge much faster than it would on unnormalized data.

Fig : With and without feature scaling in gradient descent

3. Methods of feature scaling

Rescaling

The simp...
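As a concrete illustration of rescaling (min-max normalization to [0, 1]), here is a minimal NumPy sketch; the `rescale` helper and the toy matrix are assumptions for illustration, not the post's own code.

```python
# Min-max rescaling sketch: each feature column is mapped to [0, 1]
# via x' = (x - min) / (max - min).
import numpy as np

def rescale(X):
    """Rescale each column of X independently to the [0, 1] range."""
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    return (X - x_min) / (x_max - x_min)

X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])
print(rescale(X))
# [[0.  0. ]
#  [0.5 0.5]
#  [1.  1. ]]
```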