Introduction to Advanced Machine Learning, Week 2, TensorFlow task (hse-aml/intro-to-dl, with brief notes, answers, and figures)

This article walks through the Week 2 TensorFlow programming assignment of Introduction to Advanced Machine Learning (hse-aml/intro-to-dl), with brief notes, answers, and figures. I hope it offers some useful reference to developers tackling the same assignment.

This is the first course in the HSE (Higher School of Economics) series, Introduction to Advanced Machine Learning, and this is the first programming assignment of Week 2.
The assignment consists of two tasks, both easy:
1. Get familiar with TensorFlow by computing MSE (mean squared error).
2. Classify two-class MNIST images with logistic regression, a binary classification problem.

Going deeper with Tensorflow

In this video, we’re going to study the tools you’ll use to build deep learning models. Namely, Tensorflow.

If you’re running this notebook outside the course environment, you’ll need to install tensorflow:
* pip install tensorflow should install cpu-only TF on Linux & Mac OS
* If you want GPU support from the outset, see the TF install page

import sys
sys.path.append("..")
import grading

Visualization

Please note that if you are running on the Coursera platform, you won’t be able to access the tensorboard instance due to the network setup there. If you run the notebook locally, you should be able to access TensorBoard on http://127.0.0.1:7007/

! killall tensorboard
import os
os.system("tensorboard --logdir=/tmp/tboard --port=7007 &");
/bin/sh: 1: killall: not found
import tensorflow as tf
s = tf.InteractiveSession()
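tf.InteractiveSession installs itself as the default session, which is what lets the .eval() calls later in this notebook work without passing a session around. A tiny optional sketch to illustrate (c is just an illustrative name):

c = tf.constant(42)
print(c.eval())   # works because the InteractiveSession is the default session
print(s.run(c))   # the equivalent explicit form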

Warming up

For starters, let’s implement a python function that computes the sum of squares of numbers from 0 to N-1.

import numpy as np
def sum_sin(N):
    # (despite the leftover name, this computes the sum of squares)
    return np.sum(np.arange(N)**2)
%%time
sum_sin(10**8)
CPU times: user 412 ms, sys: 344 ms, total: 756 ms
Wall time: 1.06 s
662921401752298880

Tensorflow teaser

Doing the very same thing

# An integer parameter
N = tf.placeholder('int64', name="input_to_your_function")
# A recipe on how to produce the same result
result = tf.reduce_sum(tf.range(N)**2)
result
<tf.Tensor 'Sum:0' shape=() dtype=int64>
%%time
#result.eval({N: 10**8})
s.run(result,{N:10**8})
CPU times: user 488 ms, sys: 144 ms, total: 632 ms
Wall time: 477 ms
662921401752298880
writer = tf.summary.FileWriter("/tmp/tboard", graph=s.graph)

How does it work?

  1. Define placeholders where you’ll send inputs
  2. Make symbolic graph: a recipe for mathematical transformation of those placeholders
  3. Compute outputs of your graph with particular values for each placeholder
    • output.eval({placeholder:value})
    • s.run(output, {placeholder:value})
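Putting those three steps together in one minimal sketch (reusing the session s from above; x and out are illustrative names):

x = tf.placeholder('float32', name="x")   # 1. define a placeholder
out = x * 2 + 1                           # 2. build the symbolic graph
print(out.eval({x: 3.0}))                 # 3. evaluate: prints 7.0
print(s.run(out, {x: 3.0}))               # equivalent form: prints 7.0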

So far there are two main entities: “placeholder” and “transformation”
* Both can be numbers, vectors, matrices, tensors, etc.
* Both can be int32/64, floats, or booleans (uint8) of various sizes.

  • You can define new transformations as an arbitrary operation on placeholders and other transformations
    • tf.reduce_sum(tf.range(N)**2) are 3 sequential transformations of placeholder N
    • There’s a tensorflow symbolic version for every numpy function
    • a+b, a/b, a**b, ... behave just like in numpy
    • np.mean -> tf.reduce_mean
    • np.arange -> tf.range
    • np.cumsum -> tf.cumsum
    • If you can’t find the op you need, see the docs.
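As a quick optional sanity check of this numpy/tensorflow correspondence, the symbolic ops below should reproduce their numpy counterparts exactly:

v = tf.placeholder('float32', shape=(None,))
data = np.array([1., 2., 3., 4.], dtype='float32')
print(s.run(tf.reduce_mean(v), {v: data}), np.mean(data))  # 2.5 2.5
print(s.run(tf.cumsum(v), {v: data}), np.cumsum(data))     # [  1.   3.   6.  10.] twice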

tf.contrib has many high-level features; it may be worth a look.

with tf.name_scope("Placeholders_examples"):# Default placeholder that can be arbitrary float32# scalar, vertor, matrix, etc.arbitrary_input = tf.placeholder('float32')# Input vector of arbitrary lengthinput_vector = tf.placeholder('float32', shape=(None,))# Input vector that _must_ have 10 elements and integer typefixed_vector = tf.placeholder('int32', shape=(10,))# Matrix of arbitrary n_rows and 15 columns# (e.g. a minibatch your data table)input_matrix = tf.placeholder('float32', shape=(None, 15))# You can generally use None whenever you don't need a specific shapeinput1 = tf.placeholder('float64', shape=(None, 100, None))input2 = tf.placeholder('int32', shape=(None, None, 3, 224, 224))# elementwise multiplicationdouble_the_vector = input_vector*2# elementwise cosineelementwise_cosine = tf.cos(input_vector)# difference between squared vector and vector itself plus onevector_squares = input_vector**2 - input_vector + 1
my_vector =  tf.placeholder('float32', shape=(None,), name="VECTOR_1")
my_vector2 = tf.placeholder('float32', shape=(None,))
my_transformation = my_vector * my_vector2 / (tf.sin(my_vector) + 1)
print(my_transformation)
Tensor("truediv:0", shape=(?,), dtype=float32)
dummy = np.arange(5).astype('float32')
print(dummy)
my_transformation.eval({my_vector:dummy, my_vector2:dummy[::-1]})
[ 0.  1.  2.  3.  4.]
array([ 0.        ,  1.62913239,  2.09501147,  2.62899613,  0.        ], dtype=float32)
writer.add_graph(my_transformation.graph)
writer.flush()

TensorBoard allows writing scalars, images, audio, and histograms. You can read more on tensorboard usage here.
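For example, a scalar can be logged with tf.summary.scalar and handed to the writer created earlier. A minimal sketch (the summary name is arbitrary; my_vector and dummy are reused from above):

vector_mean_summary = tf.summary.scalar("my_vector_mean", tf.reduce_mean(my_vector))
summary_proto = s.run(vector_mean_summary, {my_vector: dummy})
writer.add_summary(summary_proto, global_step=0)  # appears under the "Scalars" tab
writer.flush()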

Summary

  • Tensorflow is based on computation graphs
  • The graphs consist of placeholders and transformations

Mean squared error

Your assignment is to implement mean squared error in tensorflow.

with tf.name_scope("MSE"):y_true = tf.placeholder("float32", shape=(None,), name="y_true")y_predicted = tf.placeholder("float32", shape=(None,), name="y_predicted")# Your code goes here# You want to use tf.reduce_mean# mse = tf.<...>mse = tf.reduce_mean((y_true - y_predicted)**2)
def compute_mse(vector1, vector2):return mse.eval({y_true: vector1, y_predicted: vector2})
writer.add_graph(mse.graph)
writer.flush()
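Before submitting, a quick optional self-check of compute_mse with values worked out by hand:

print(compute_mse([0., 0.], [1., 1.]))  # squared errors 1 and 1, mean = 1.0
print(compute_mse([1., 2.], [1., 1.]))  # squared errors 0 and 1, mean = 0.5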

Tests and result submission. Please use the credentials obtained from the Coursera assignment page.

Variables

The inputs and transformations have no value outside a function call. That isn’t convenient if you want your model to have parameters (e.g. network weights) that are always present but can change their value over time.

Tensorflow solves this with tf.Variable objects.
* You can assign a variable a value at any time in your graph
* Unlike placeholders, there’s no need to explicitly pass values to variables when s.run(...)-ing
* You can use variables the same way you use transformations

# Creating a shared variable
shared_vector_1 = tf.Variable(initial_value=np.ones(5), name="example_variable")
# Initialize variable(s) with initial values
s.run(tf.global_variables_initializer())
# Evaluating the shared variable (outside the symbolic graph)
print("Initial value", s.run(shared_vector_1))
# Within the symbolic graph you use them just
# as any other input or transformation, no "get value" needed
Initial value [ 1.  1.  1.  1.  1.]
# Setting a new value
s.run(shared_vector_1.assign(np.arange(5)))
# Getting that new value
print("New value", s.run(shared_vector_1))
New value [ 0.  1.  2.  3.  4.]
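To see the last bullet in action: a variable can feed into further transformations just like any placeholder, with no feed_dict required. A small sketch (doubled is an illustrative name):

doubled = shared_vector_1 * 2  # a transformation built on top of a variable
print(s.run(doubled))          # no feed_dict needed: prints [ 0.  2.  4.  6.  8.]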

tf.gradients - why graphs matter

  • Tensorflow can compute derivatives and gradients automatically using the computation graph
  • True to its name it can manage matrix derivatives
  • Gradients are computed as a product of elementary derivatives via the chain rule:

$$\frac{\partial f(g(x))}{\partial x} = \frac{\partial f(g(x))}{\partial g(x)} \cdot \frac{\partial g(x)}{\partial x}$$

It can get you the derivative of any graph as long as it knows how to differentiate elementary operations

my_scalar = tf.placeholder('float32')
scalar_squared = my_scalar**2
# A derivative of scalar_squared by my_scalar
derivative = tf.gradients(scalar_squared, [my_scalar, ])
derivative
[<tf.Tensor 'gradients/pow_1_grad/Reshape:0' shape=<unknown> dtype=float32>]
import matplotlib.pyplot as plt
%matplotlib inline
x = np.linspace(-3, 3)
x_squared, x_squared_der = s.run([scalar_squared, derivative[0]],
                                 {my_scalar: x})
# derivative[0]: tf.gradients returned a list, so we take its only element
plt.plot(x, x_squared, label="$x^2$")
plt.plot(x, x_squared_der, label=r"$\frac{dx^2}{dx}$")
plt.legend();
plt.grid()

(Figure: $x^2$ and its derivative plotted over $[-3, 3]$)

Why that rocks

my_vector = tf.placeholder('float32', [None])
# Compute the gradient of the next weird function over my_scalar and my_vector
# Warning! Trying to understand the meaning of that function may result in permanent brain damage
weird_psychotic_function = tf.reduce_mean(
    (my_vector + my_scalar)**(1 + tf.nn.moments(my_vector, [0])[1]) +
    1. / tf.atan(my_scalar)) / (my_scalar**2 + 1) + \
    0.01 * tf.sin(2 * my_scalar**1.5) * \
    (tf.reduce_sum(my_vector) * my_scalar**2) * \
    tf.exp((my_scalar - 4)**2) / (1 + tf.exp((my_scalar - 4)**2)) * \
    (1. - (tf.exp(-(my_scalar - 4)**2)) / (1 + tf.exp(-(my_scalar - 4)**2)))**2

der_by_scalar = tf.gradients(weird_psychotic_function, my_scalar)
der_by_vector = tf.gradients(weird_psychotic_function, my_vector)
# Plotting the derivative
scalar_space = np.linspace(1, 7, 100)
y = [s.run(weird_psychotic_function, {my_scalar: x, my_vector: [1, 2, 3]})
     for x in scalar_space]
plt.plot(scalar_space, y, label='function')
y_der_by_scalar = [s.run(der_by_scalar, {my_scalar: x, my_vector: [1, 2, 3]})
                   for x in scalar_space]
plt.plot(scalar_space, y_der_by_scalar, label='derivative')
plt.grid()
plt.legend();

(Figure: the weird function and its derivative plotted over [1, 7])

y_guess = tf.Variable(np.zeros(2, dtype='float32'))
y_true = tf.range(1, 3, dtype='float32')
loss = tf.reduce_mean((y_guess - y_true + tf.random_normal([2]))**2) 
#loss = tf.reduce_mean((y_guess - y_true)**2) 
#loss = -tf.reduce_mean(y_true * tf.log(y_guess) + (1-y_true) * tf.log(1-y_guess)) 
optimizer = tf.train.MomentumOptimizer(0.01, 0.5).minimize(loss, var_list=y_guess)
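For reference, minimize() is shorthand for computing gradients and applying an update in one op; it is roughly the two-step version sketched here (optimizer_explicit is an illustrative name, not used below):

opt = tf.train.MomentumOptimizer(0.01, 0.5)
grads_and_vars = opt.compute_gradients(loss, var_list=[y_guess])
optimizer_explicit = opt.apply_gradients(grads_and_vars)  # same update as minimize()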
from matplotlib import animation, rc
import matplotlib_utils
from IPython.display import HTML, display_html

fig, ax = plt.subplots()
y_true_value = s.run(y_true)
level_x = np.arange(0, 2, 0.02)
level_y = np.arange(0, 3, 0.02)
X, Y = np.meshgrid(level_x, level_y)
Z = (X - y_true_value[0])**2 + (Y - y_true_value[1])**2
ax.set_xlim(-0.02, 2)
ax.set_ylim(-0.02, 3)
s.run(tf.global_variables_initializer())
ax.scatter(*s.run(y_true), c='red')
contour = ax.contour(X, Y, Z, 10)
ax.clabel(contour, inline=1, fontsize=10)
line, = ax.plot([], [], lw=2)

def init():
    line.set_data([], [])
    return (line,)

guesses = [s.run(y_guess)]

def animate(i):
    s.run(optimizer)
    guesses.append(s.run(y_guess))
    line.set_data(*zip(*guesses))
    return (line,)

anim = animation.FuncAnimation(fig, animate, init_func=init,
                               frames=400, interval=20, blit=True)

(Figure: loss contours with the optimizer's trajectory of guesses animated over 400 frames)

try:
    display_html(HTML(anim.to_html5_video()))
except RuntimeError:
    # In case the built-in renderers are unavailable, fall back to
    # a custom one that doesn't require external libraries
    anim.save(None, writer=matplotlib_utils.SimpleMovieWriter(0.001))

Logistic regression

Your assignment is to implement logistic regression.

Plan:
* Use a shared variable for weights
* Use a matrix placeholder for X

We shall train on a two-class MNIST dataset.
* Please note that the targets y are {0, 1} and not {-1, 1} as in some formulae.

from sklearn.datasets import load_digits
mnist = load_digits(2)
X, y = mnist.data, mnist.target

print("y [shape - %s]:" % (str(y.shape)), y[:10])
print("X [shape - %s]:" % (str(X.shape)))  # 64 input features, 360 examples
y [shape - (360,)]: [0 1 0 1 0 1 0 0 1 1]
X [shape - (360, 64)]:
print('X:\n',X[:3,:10])
print('y:\n',y[:10])
plt.imshow(X[1].reshape([8,8]));
X:
 [[  0.   0.   5.  13.   9.   1.   0.   0.   0.   0.]
 [  0.   0.   0.  12.  13.   5.   0.   0.   0.   0.]
 [  0.   0.   1.   9.  15.  11.   0.   0.   0.   0.]]
y:
 [0 1 0 1 0 1 0 0 1 1]

(Figure: X[1] rendered as an 8×8 grayscale digit image)

It’s your turn now!
Just a small reminder of the relevant math:

$$P(y=1|X) = \sigma(X \cdot W + b)$$

$$\text{loss} = -\log\big(P(y_{\text{predicted}}=1)\big) \cdot y_{\text{true}} - \log\big(1 - P(y_{\text{predicted}}=1)\big) \cdot (1 - y_{\text{true}})$$

$\sigma(x)$ is available via tf.nn.sigmoid and matrix multiplication via tf.matmul.
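If you want to convince yourself that tf.nn.sigmoid really computes $\sigma(x) = 1 / (1 + e^{-x})$, a quick optional check:

xs = tf.placeholder('float32', shape=(None,))
print(s.run(tf.nn.sigmoid(xs), {xs: [0., 2.]}))        # [ 0.5        0.88079703]
print(s.run(1. / (1. + tf.exp(-xs)), {xs: [0., 2.]}))  # the same values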

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

Your code goes here. For the training and testing scaffolding to work, please stick to the names in comments.

# Model parameters - weights and bias
# weights = tf.Variable(...) shape should be (X.shape[1], 1)
# b = tf.Variable(...)
# A zero initialization would also work:
# weights = tf.Variable(np.zeros((X.shape[1], 1), dtype='float32'))  # 64 input features
# b = tf.Variable(np.zeros((1, 1), dtype='float32'))
weights = tf.Variable(tf.random_normal(shape=[X.shape[1], 1], mean=0, stddev=0.01))
b = tf.Variable(0.0)
s.run(tf.global_variables_initializer())
# Placeholders for the input data
# input_X = tf.placeholder(...)
# input_y = tf.placeholder(...)
input_X = tf.placeholder("float32", shape=(None, None), name="input_X")  # (None, None) keeps the placeholder adaptable to any batch size and feature count
input_y = tf.placeholder("float32", shape=(None,), name="input_y")
# The model code
# Compute a vector of predictions; the resulting shape should be [input_X.shape[0],]
# This is 1D; if you have extra dimensions, you can get rid of them with tf.squeeze.
# Don't forget the sigmoid.
# predicted_y = <predicted probabilities for input_X>
predicted_y = tf.sigmoid(tf.matmul(input_X, weights) + b)
predicted_y = tf.squeeze(predicted_y)

# Loss. Should be a scalar number - the average loss over all the objects
# tf.reduce_mean is your friend here
# loss = <logistic loss (scalar, mean over sample)>
loss = -tf.reduce_mean(input_y * tf.log(predicted_y) + (1 - input_y) * tf.log(1 - predicted_y))
print(loss.shape)
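One caveat worth knowing: tf.log(predicted_y) produces NaN once the sigmoid saturates at exactly 0 or 1. The grader expects the explicit formula above, but for reference a numerically stabler variant (a sketch, not the graded answer) keeps the raw logits and lets TensorFlow fuse the sigmoid with the cross-entropy:

# Sketch of a stabler alternative: fuse sigmoid + log in a single op
logits = tf.squeeze(tf.matmul(input_X, weights) + b)
stable_loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=input_y, logits=logits))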
# See above for an example: tf.train.*Optimizer
# optimizer = <optimizer that minimizes loss>
# optimizer = tf.train.MomentumOptimizer(0.01, 0.5).minimize(loss)
optimizer = tf.train.AdamOptimizer(learning_rate=0.01, beta1=0.9, beta2=0.999,
                                   epsilon=1e-08).minimize(loss)

A test to help with debugging

validation_weights = 1e-3 * np.fromiter(
    map(lambda x: s.run(weird_psychotic_function,
                        {my_scalar: x, my_vector: [1, 0.1, 2]}),
        0.15 * np.arange(1, X.shape[1] + 1)),
    count=X.shape[1], dtype=np.float32)[:, np.newaxis]

# Compute predictions for given weights and bias
prediction_validation = s.run(predicted_y, {
    input_X: X,
    weights: validation_weights,
    b: 1e-1})

# Load the reference values for the predictions
validation_true_values = np.loadtxt("validation_predictons.txt")

assert prediction_validation.shape == (X.shape[0],), \
    "Predictions must be a 1D array with length equal to the number " \
    "of examples in input_X"
assert np.allclose(validation_true_values, prediction_validation)

loss_validation = s.run(loss, {
    input_X: X[:100],
    input_y: y[-100:],
    weights: validation_weights + 1.21e-3,
    b: -1e-1})
assert np.allclose(loss_validation, 0.728689)
from sklearn.metrics import roc_auc_score
s.run(tf.global_variables_initializer())
for i in range(5):
    s.run(optimizer, {input_X: X_train, input_y: y_train})
    loss_i = s.run(loss, {input_X: X_train, input_y: y_train})
    print("loss at iter %i:%.4f" % (i, loss_i))
    print("train auc:", roc_auc_score(y_train, s.run(predicted_y, {input_X: X_train})))
    print("test auc:", roc_auc_score(y_test, s.run(predicted_y, {input_X: X_test})))

Coursera submission

grade_submitter = grading.Grader("BJCiiY8sEeeCnhKCj4fcOA")
test_weights = 1e-3 * np.fromiter(
    map(lambda x: s.run(weird_psychotic_function,
                        {my_scalar: x, my_vector: [1, 2, 3]}),
        0.1 * np.arange(1, X.shape[1] + 1)),
    count=X.shape[1], dtype=np.float32)[:, np.newaxis]

First, test prediction and loss computation. This part doesn’t require a fitted model.

prediction_test = s.run(predicted_y, {
    input_X: X,
    weights: test_weights,
    b: 1e-1})

assert prediction_test.shape == (X.shape[0],), \
    "Predictions must be a 1D array with length equal to the number " \
    "of examples in X_test"

grade_submitter.set_answer("0ENlN", prediction_test)

loss_test = s.run(loss, {
    input_X: X[:100],
    input_y: y[-100:],  # Yes, the X/y indices mismatch is intentional
    weights: test_weights + 1.21e-3,
    b: -1e-1})

This concludes the walkthrough of Introduction to Advanced Machine Learning, Week 2, TensorFlow task (hse-aml/intro-to-dl, with brief notes, answers, and figures). I hope it proves helpful to other programmers working through the assignment!



