Deep learning is pattern recognition performed by neural networks. Neural networks are a set of algorithms loosely modeled on the human brain. They work like sensors, providing a form of machine perception. Deep learning is the name for a certain type of stacked neural network composed of several node layers. Each layer’s output is the subsequent layer’s input, starting from an initial input layer.

Deep-learning networks are distinguished from the more commonplace single-hidden-layer neural networks by their depth, that is, the number of node layers through which data passes in a multistep process of pattern recognition. A single-hidden-layer network already has three layers counting input and output, so by this convention a network needs more than three layers (at least two hidden layers) to count as deep. Anything less is ordinary, shallow machine learning.
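To make the idea of stacked layers concrete, here is a minimal sketch in plain Python. The weights, the `tanh` activation, and the names `dense` and `forward` are illustrative assumptions, not taken from any particular library; a real network would learn its weights from data.

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum per node, followed by tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(x, layers):
    """Feed x through the layers in order; each output becomes the next input."""
    activations = [x]
    for weights, biases in layers:
        activations.append(dense(activations[-1], weights, biases))
    return activations

# Hypothetical hand-picked weights: 3 inputs -> 2 hidden -> 2 hidden -> 1 output
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[0.7, -0.3]], [0.2]),
]

activations = forward([1.0, 2.0, 3.0], layers)
layer_sizes = [len(a) for a in activations]
```

Here `layer_sizes` is `[3, 2, 2, 1]`: an input layer, two hidden layers, and an output layer, four layers in total.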

Deep learning is motivated by intuition, theoretical arguments from circuit theory, empirical results, and current knowledge of neuroscience.

- The main concept in deep learning algorithms is automating the extraction of representations (abstractions) from the data.

- A key concept underlying deep learning methods is distributed representations of the data, in which a large number of configurations of the abstract features of the input data are possible, allowing for a compact representation of each sample and leading to richer generalization.

- Deep learning algorithms lead to abstract representations because more abstract representations are often constructed from less abstract ones. An important advantage of more abstract representations is that they can be invariant to local changes in the input data.

- Deep learning algorithms are, in effect, deep architectures of consecutive layers.

- Stacking up the nonlinear transformation layers is the basic idea in deep learning algorithms.

- The transformations in the layers of a deep architecture are non-linear, and they try to extract the underlying explanatory factors in the data.

- The final representation constructed by the deep learning algorithm (the output of the final layer) carries useful information about the data. It can serve as features for building classifiers, or for data indexing and other applications that work more efficiently with abstract representations than with high-dimensional sensory data.
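A small sketch can show why the non-linearity between layers matters. The weights below are hand-picked for illustration (hypothetical, not learned), and ReLU is just one common choice of non-linearity. Without it, these two stacked layers collapse into a single linear map; with it, the same two layers compute the absolute difference of the inputs, which no single linear layer can represent.

```python
def linear(x, weights, biases):
    """A linear layer: one weighted sum plus bias per output node."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weights, biases)
    ]

def relu(values):
    """Element-wise non-linearity: negative values are clipped to zero."""
    return [max(0.0, v) for v in values]

# Hypothetical weights chosen by hand for this demonstration
W1, b1 = [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]
W2, b2 = [[1.0, 1.0]], [0.0]

def stacked_linear(x):
    # Two linear layers with nothing in between: equivalent to one linear layer.
    return linear(linear(x, W1, b1), W2, b2)

def stacked_relu(x):
    # Same weights, but with a non-linearity between the layers.
    return linear(relu(linear(x, W1, b1)), W2, b2)

print(stacked_linear([3.0, 1.0]))  # [0.0]
print(stacked_relu([3.0, 1.0]))    # [2.0]
print(stacked_relu([1.0, 3.0]))    # [2.0]
```

With these particular weights the purely linear stack always outputs zero, because the product of the two weight matrices is the zero matrix. The ReLU version keeps whichever of `x0 - x1` or `x1 - x0` is positive, so the second layer ends up with a more abstract feature (the size of the gap) built from less abstract ones.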

Let’s understand this in layman’s terms:

Imagine you’re building a shopping recommendation engine, and you discover that if an item is trending *and* a user has browsed the category of that item in the last day, they are very likely to buy the trending item.

These two variables are so predictive together that you can combine them into a single new variable, or **feature** (call it “interested_in_trending_category”, for example).

Finding connections between variables and packaging them into a new single variable is called **feature engineering**.
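Written by hand, that feature might look like the sketch below. The `Visit` class and its two fields are hypothetical stand-ins for whatever raw variables a real recommendation engine would log, and the 24-hour cutoff simply encodes “in the last day”.

```python
from dataclasses import dataclass

@dataclass
class Visit:
    """One user-item pair; both fields are hypothetical raw variables."""
    item_is_trending: bool
    hours_since_category_browse: float

def interested_in_trending_category(visit: Visit) -> bool:
    """Hand-engineered feature: trending AND category browsed in the last day."""
    return visit.item_is_trending and visit.hours_since_category_browse <= 24.0

visits = [
    Visit(item_is_trending=True, hours_since_category_browse=3.0),
    Visit(item_is_trending=True, hours_since_category_browse=72.0),
    Visit(item_is_trending=False, hours_since_category_browse=1.0),
]
features = [interested_in_trending_category(v) for v in visits]
print(features)  # [True, False, False]
```

Only the first visit satisfies both conditions, so only it gets the combined feature.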

Deep learning is *automated* feature engineering.
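To hint at what “automated” means here, the sketch below lets a single artificial neuron learn the trending-and-browsed combination from labeled examples instead of having it written by hand. This is the classic perceptron learning rule on a toy, made-up dataset. It is far simpler than a deep network, but the principle of fitting weights to data is the same.

```python
# Toy, made-up data: (item_is_trending, browsed_category_in_last_day) -> bought
examples = [
    ((0, 0), 0),
    ((0, 1), 0),
    ((1, 0), 0),
    ((1, 1), 1),
]

def predict(weights, bias, x):
    """Fire (return 1) when the weighted sum of the inputs exceeds zero."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0

def train_perceptron(data, epochs=20):
    """Perceptron learning rule: nudge the weights after every wrong prediction."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for x, target in data:
            error = target - predict(weights, bias, x)  # 0 when already correct
            weights = [w + error * xi for w, xi in zip(weights, x)]
            bias += error
    return weights, bias

weights, bias = train_perceptron(examples)
learned = [predict(weights, bias, x) for x, _ in examples]
```

After training, `learned` is `[0, 0, 0, 1]`: the neuron fires only when both conditions hold, so it has recovered the combined feature from the examples alone.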
