16. March 2018
Illustration of filters at different depths, by Yann LeCun (2013)
Credits to Niklas Klein for this example.
Max pooling:

- kernel_size: among this many neighbouring values, select the largest one
- stride = kernel_size: non-overlapping windows
- here: kernel_size = stride = 3
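Max pooling with these settings can be sketched in plain Python (the input values are illustrative):

```python
def max_pool_1d(values, kernel_size=3, stride=3):
    """Slide a window of kernel_size over the values and keep only the
    largest entry per window; stride == kernel_size means the windows
    do not overlap."""
    return [max(values[i:i + kernel_size])
            for i in range(0, len(values) - kernel_size + 1, stride)]

print(max_pool_1d([1, 5, 2, 8, 3, 3, 0, 9, 4]))  # [5, 8, 9]
```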
Classification errors on different tasks and models by Yann LeCun (2013)
Example of a positive Yelp review
Word count distribution of the training reviews
My go-to ice cream place in the summer! The lines are usually long and you have to wait a while, but they have some delicious snowstorms! They also have a drive-thru, which I have never used, but that is because I like to sit out front at the tables in the parking lot. Perfect place to go with friends on a beautiful summer night for a yummy treat!
Example of a negative Yelp review

I have been to this Mesa AZ location on Alma School Road a few times with good results but this last time Tuesday 8/19/2014 will be my last. The waitress gave me a crazy look when I asked if there was a house italian salad dressing. The deep dish pizza was terrible. They hardly put any cheese on the thing and all is left is one inch of dough for 16 dollars. Just Terrible.
Loss: categorical cross entropy
Optimizer: adam, with:

- learning rate = 0.001
- beta 1 = 0.09
- beta 2 = 0.999
- decay = 0.95
- patience = 3
- batch_size = 1000
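As a sketch of what the optimizer does, a single Adam update step in plain NumPy (beta1 and beta2 are shown with Adam's standard defaults; substitute the values above to match the training setup, and eps is Adam's usual numerical-stability constant, not listed above):

```python
import numpy as np

def adam_step(param, grad, m, v, t,
              lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam parameter update (Kingma & Ba).  m and v are running
    estimates of the gradient's first and second moment, t is the
    1-based step counter used for bias correction."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)          # bias-corrected 1st moment
    v_hat = v / (1 - beta2 ** t)          # bias-corrected 2nd moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Minimise f(x) = x**2 from x = 1.0 for a few steps:
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 101):
    x, m, v = adam_step(x, 2 * x, m, v, t)  # gradient of x**2 is 2x
```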
The reviews are loaded batch-wise from the .csv file into keras via the method model.fit_generator(generator=trainGenerator ...
Confusion matrices for the three training set sizes:

| 560K | positive | negative |
|---|---|---|
| positive | 17723 | 406 |
| negative | 554 | 19317 |

| 280K | positive | negative |
|---|---|---|
| positive | 16921 | 1208 |
| negative | 615 | 19256 |

| 140K | positive | negative |
|---|---|---|
| positive | 16698 | 1431 |
| negative | 1049 | 18822 |
| Training set size | AUC | Accuracy |
|---|---|---|
| 560K | 0.9959275 | 0.9747368 |
| 280K | 0.9892149 | 0.9520263 |
| 140K | 0.9817623 | 0.9347368 |
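The accuracy column can be recomputed directly from the confusion matrices above: correctly classified reviews divided by the total of 38000 (the orientation of the matrices does not matter for this, since accuracy only uses the diagonal and the total):

```python
# Cells of each confusion matrix, read row by row from the tables above.
confusions = {
    "560K": (17723, 406, 554, 19317),
    "280K": (16921, 1208, 615, 19256),
    "140K": (16698, 1431, 1049, 18822),
}
for size, cells in confusions.items():
    accuracy = (cells[0] + cells[3]) / sum(cells)  # diagonal / total
    print(size, round(accuracy, 7))
# 560K 0.9747368
# 280K 0.9520263
# 140K 0.9347368
```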
CNN as black box model
Illustration of LIME by Ribeiro, Singh, and Guestrin (2016). The blue and pink areas depict the complex decision function of \(f\). The bold cross marks the observation that is to be explained; the other crosses and circles mark perturbations \(z\) of \(x\), sized proportionally to their proximity. The dashed line is the locally learned explanation.
| Perturbation | I | like | the | good | expensive | beer | P(positive) |
|---|---|---|---|---|---|---|---|
| I like beer | 1 | 1 | 0 | 0 | 0 | 1 | 0.50 |
| I like good | 1 | 1 | 0 | 1 | 0 | 0 | 0.80 |
| I like good expensive beer | 1 | 1 | 0 | 1 | 1 | 1 | 0.74 |
| I the good beer | 1 | 0 | 1 | 1 | 0 | 1 | 0.54 |
| expensive beer | 0 | 0 | 0 | 0 | 1 | 1 | 0.27 |
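The local surrogate step can be sketched with this table's design matrix and predictions; the proximity weights below are illustrative assumptions (LIME derives them from a kernel over the distance between each perturbation and the original sentence):

```python
import numpy as np

# Design matrix from the table (columns: I, like, the, good, expensive, beer)
# and the black-box predictions P(positive) for each perturbation.
Z = np.array([[1, 1, 0, 0, 0, 1],
              [1, 1, 0, 1, 0, 0],
              [1, 1, 0, 1, 1, 1],
              [1, 0, 1, 1, 0, 1],
              [0, 0, 0, 0, 1, 1]], dtype=float)
p = np.array([0.50, 0.80, 0.74, 0.54, 0.27])

# Illustrative proximity weights pi(z): more words kept -> closer to x.
w = np.array([0.6, 0.6, 1.0, 0.8, 0.4])

# Weighted least squares: scale each row by sqrt(weight), then solve.
sqrt_w = np.sqrt(w)[:, None]
coef, *_ = np.linalg.lstsq(Z * sqrt_w, p * np.sqrt(w), rcond=None)
for word, c in zip(["I", "like", "the", "good", "expensive", "beer"], coef):
    print(f"{word:9s} {c:+.3f}")
```

The fitted coefficients play the role of the per-word explanation: a word with a large positive coefficient pushes the local prediction towards the positive class.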
Screenshot from LIME in action
Select n features (e.g. via lasso, forward-selection, …) and fit an interpretable model on the n selected features.

Parameters of the LIME implementation:

- num_features = 5: the number of features to be selected
- num_samples = 1000: the size of the design matrix
- bow = False: controls whether all occurrences of a word are perturbed at once or one by one
- feature_selection = "lasso_path": the selection method for step 3

Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. 2016. Deep Learning. MIT Press.
Ribeiro, Marco Túlio, Sameer Singh, and Carlos Guestrin. 2016. “‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier.” CoRR abs/1602.04938. http://arxiv.org/abs/1602.04938.
LeCun, Yann, and Marc’Aurelio Ranzato. 2013. “Deep Learning Tutorial.” http://www.cs.nyu.edu/~yann/talks/lecun-ranzato-icml2013.pdf.
Zhang, Xiang, Junbo Zhao, and Yann LeCun. 2015. “Character-Level Convolutional Networks for Text Classification.” In Advances in Neural Information Processing Systems 28, 649–57.