Hello, and welcome!
In this video, we'll provide an overview of Recursive Neural Tensor Networks, as well
as the natural language processing problems that they're able to solve.
Sentiment Analysis is the task of identifying and extracting subjective information, like
emotion or opinion, from a source material.
For example, this might involve analyzing a Twitter feed to determine which tweets express
a positive feeling, which express a negative feeling, and which are neutral.
In order to classify sentences into different sentiment classes, we'll need a dataset to
use for training.
One potential dataset is the Stanford Sentiment Treebank.
Each data point is the syntax tree of a Rotten Tomatoes review.
The tree itself and all of its subtrees are labeled with a sentiment value from 1 to 25,
where 25 is the best possible review and 1 is the worst.
The dataset was created by Stanford researchers, who utilized Amazon's Mechanical Turk platform
in order to assign values.
Recursive neural models can be used for the sentiment analysis problem.
These types of models are characterized by their use of vector representations.
Vectors are used to represent the words, as well as all of the sub-sentences in an input's
syntax tree.
The word representations are trained with the model, while the representations of sub-sentences
are calculated with a compositionality function.
To get the sub-sentences' representations, we apply the compositionality function bottom-up,
following the input's parse tree.
All vectors are fed to the same softmax classifier to determine the sentiment.
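To make this concrete, here's a minimal sketch, in NumPy, of how a sentiment distribution could be computed bottom-up over a binary parse tree. Everything here, including the `score_tree` and `compose` names and the classifier weights `W_s`, is an illustrative assumption rather than code from the original papers.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def score_tree(node, word_vecs, compose, W_s):
    """Return (vector, sentiment distribution) for a parse-tree node.

    `node` is either a word (a string) or a pair (left, right) of nodes.
    """
    if isinstance(node, str):
        vec = word_vecs[node]                      # leaf: look up the trained word vector
    else:
        left_vec, _ = score_tree(node[0], word_vecs, compose, W_s)
        right_vec, _ = score_tree(node[1], word_vecs, compose, W_s)
        vec = compose(left_vec, right_vec)         # combine the children bottom-up
    return vec, softmax(W_s @ vec)                 # the same classifier scores every node
```

Each of the three models we'll look at next simply plugs in a different `compose` function; the classifier weights `W_s` map a node's "d"-dimensional vector to the sentiment classes and are shared across every node.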
The choice of compositionality function is important, so we'll present three different
types of recursive models, each with a different function.
The first model we'll look at is the basic Recursive Neural Network.
To compute our word composition, we start with the two vectors that we want to combine,
which we'll call "b" and "c".
We form a "two d"-dimensional vector by concatenating "b" and "c".
This vector is then multiplied by the "d" by "two d"
weight matrix "W".
"W" is the model's main training parameter.
Then a nonlinearity is applied element-wise to the resulting vector.
In this case, the nonlinearity is the hyperbolic tangent function.
As a brief note, we've omitted the bias for simplicity.
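Here's what that composition could look like as a minimal NumPy sketch; the toy dimension `d = 4` and the random initialization are assumptions chosen just for illustration.

```python
import numpy as np

d = 4                                        # toy embedding size (an assumption)
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((d, 2 * d))    # the d-by-2d weight matrix, trained with the model

def compose_basic(b, c):
    """Basic recursive-net composition: p = tanh(W [b; c]), bias omitted."""
    return np.tanh(W @ np.concatenate([b, c]))

b = rng.standard_normal(d)                   # two toy word vectors
c = rng.standard_normal(d)
p = compose_basic(b, c)                      # a d-dimensional parent vector
```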
Other models, like the recursive autoencoder and recursive auto-associative memories,
use this same compositionality function.
As you can see, the words only interact implicitly through the nonlinearity, so the compositionality
function may not be consistent with linguistic principles.
The model also forgoes the reconstruction loss used by those models, since the dataset is large enough to compensate.
Now let's move on to Matrix-Vector Recursive Neural Networks.
This type of model is a linguistically-motivated improvement over the basic recursive neural
network.
The big change is that now every word is represented by both a vector and a "d" by "d" matrix.
The compositionality function that you see here takes four objects.
Lowercase "b" and "c" are the word vectors, while the uppercase "b" and "c" are the respective matrices.
Lowercase "p1" is the resulting vector, while uppercase "P1" is the respective matrix.
Just like with basic recursive neural networks, a matrix "W" is multiplied with a matrix created
from the words' representations.
But in this case, the matrix created is much more dependent on the relationship between
the two input words.
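Here's a sketch of this composition, assuming the standard MV-RNN formulation in which each word's matrix transforms the other word's vector before the combination by "W"; the second weight matrix `W_M`, which produces the parent matrix, and the toy dimensions are illustrative assumptions.

```python
import numpy as np

d = 4                                        # toy embedding size (an assumption)
rng = np.random.default_rng(0)
W   = 0.1 * rng.standard_normal((d, 2 * d))  # combines the transformed word vectors
W_M = 0.1 * rng.standard_normal((d, 2 * d))  # combines the two word matrices

def compose_mv(b, B, c, C):
    """MV-RNN composition: each word's matrix transforms the other word's vector.

    Returns the parent vector p1 and the parent matrix P1.
    """
    p1 = np.tanh(W @ np.concatenate([C @ b, B @ c]))
    P1 = W_M @ np.vstack([B, C])             # (d x 2d) @ (2d x d) -> a d x d parent matrix
    return p1, P1
```

Because the parent also gets a "d" by "d" matrix, the composition can be applied recursively, but every vocabulary word now carries d-squared extra parameters, which is the scaling problem mentioned above.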
The problem with this model is that the number of trainable parameters becomes too large
as the vocabulary size increases.
The Recursive Neural Tensor Network, or RNTN, uses a powerful fixed-size compositionality
function that only takes the words' vectors as arguments.
Words are no longer parameterized by matrices; instead, the model adds a "two d" by "two d" by "d"
tensor that is used in the function.
This tensor is also trained with the model.
Each of the "d"
slices captures a different type of composition, so intuitively, it is more capable of learning
than the basic recursive neural network.
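Here's a minimal sketch of the tensor composition, again with toy dimensions and names chosen just for illustration.

```python
import numpy as np

d = 4                                              # toy embedding size (an assumption)
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((d, 2 * d))          # the standard d-by-2d weight matrix
V = 0.1 * rng.standard_normal((d, 2 * d, 2 * d))   # d tensor slices, each 2d-by-2d

def compose_rntn(b, c):
    """RNTN composition: p = tanh(x^T V x + W x), where x = [b; c].

    Slice i of the tensor produces entry i of the quadratic term.
    """
    x = np.concatenate([b, c])
    quad = np.array([x @ V[i] @ x for i in range(d)])  # one value per slice
    return np.tanh(quad + W @ x)
```

Note how each slice `V[i]` contributes one entry of the quadratic term, which is what lets different slices capture different types of composition.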
It turns out that RNTNs outperform the alternative methods.
They have achieved over eighty-seven percent accuracy in positive/negative word classification,
and over eighty-five percent accuracy in positive/negative sentence classification on the Stanford
Sentiment Treebank.
That sentence classification accuracy is more than three percent higher than that of
basic recursive networks.
Recursive Neural Tensor Networks can also be used in other applications, such as parsing
natural scenes and parsing natural language.
This is due to the recursive nature of these problems.
If you're interested in learning more about RNTNs, we recommend you follow the link here
to a great article by Socher and others.
By now, you should understand the intuition behind recursive neural models, and recursive
neural tensor networks.
Thank you for watching this video.