Four steps to master machine learning with Python (including free books & resources)


To understand and apply machine learning techniques you have to learn Python or R. Both are programming languages, like C, Java or PHP. However, since Python and R are higher-level languages, “further away” from the CPU, they are easier to learn. The advantage of Python is that it can be applied to many more problems than R, which is mainly used for handling data, analysing it with, e.g., machine learning and statistical algorithms, and plotting it in nice graphs. Because Python has a broader range of uses (hosting websites with Django, natural language processing, accessing the APIs of websites such as Twitter and LinkedIn, etc.) and more closely resembles classical programming languages like C, Python is more popular.

The four steps of learning machine learning in Python

  1. First you have to learn the basics of Python using books, courses and videos.
  2. Then you have to master the different modules such as Pandas, NumPy and Matplotlib, as well as natural language processing (NLP) tools, in order to handle, clean, plot and understand data.
  3. Afterwards you have to be able to scrape data from the web, which is done either through the APIs of websites or with a web-scraping module such as Beautiful Soup. Web scraping allows you to collect the data which you feed into your machine learning algorithms.
  4. In the last step you have to learn machine learning (ML) tools like Scikit-Learn or implement ML algorithms from scratch.

1. Getting started with Python:

An easy and fast way to learn Python is to register at codecademy.com and immediately start to code and learn the basics of Python. A classic is the website Learn Python the Hard Way, which is referenced by a lot of Python programmers. A good PDF is A Byte of Python. A list of Python resources for beginners is also provided by the Python community. A book from O’Reilly is Think Python, which can be downloaded for free. A last resource is Introduction to Python for Econometrics, Statistics and Data Analysis, which also covers the basics of Python.

2. Important Modules for machine learning

The most important modules for machine learning are NumPy, Pandas, Matplotlib and IPython. A book covering a couple of these modules is Data Analysis with Open Source Tools. The free book Introduction to Python for Econometrics, Statistics and Data Analysis from step 1 also covers NumPy, Pandas, Matplotlib and IPython. Another resource is Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, which also covers the most important modules. Here are other free NumPy (Numerical Python, NumPy User Guide, Guide to NumPy), Pandas (Pandas, Powerful Python Data Analysis Toolkit, Practical Business Python, Intros to Pandas Data Structure) and Matplotlib books.

Other resources:

  • 10 minutes to Pandas
  • Pandas for machine learning
  • 100 NumPy exercises
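
To give a feel for what these modules do, here is a minimal sketch of handling and cleaning a small table with Pandas and NumPy. The column names and values are made up for illustration and do not come from any of the books above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: apartment sizes and prices, with one missing price.
df = pd.DataFrame({
    "size_m2": [30.0, 45.0, 60.0, 80.0],
    "price": [90_000.0, 130_000.0, np.nan, 240_000.0],
})

# Clean: drop every row that contains a missing value.
clean = df.dropna().copy()

# Handle: derive a new column with vectorised arithmetic (no explicit loop).
clean["price_per_m2"] = clean["price"] / clean["size_m2"]

# Understand: summarise the derived column with NumPy.
mean_price_per_m2 = float(np.mean(clean["price_per_m2"].to_numpy()))

print(clean)
print(round(mean_price_per_m2, 2))
```

If Matplotlib is installed, a call such as clean.plot(x="size_m2", y="price", kind="scatter") should draw the cleaned data as a scatter plot.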

3. Mining and scraping the data from websites and through APIs

Once you have understood the basics of Python and the most important modules, you have to learn how to collect data from different sources. This technique is also called web scraping. Classic sources are text from websites and textual data obtained through the APIs of websites such as Twitter or LinkedIn. Good books on web scraping are Mining the Social Web (free book!), Web Scraping with Python and Web Scraping with Python: Collecting Data from the Modern Web.
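
The core idea of web scraping can be sketched in a few lines: take the HTML of a page and pull out the pieces you care about. The sketch below uses the html.parser module from the standard library so that it runs without extra installs; Beautiful Soup offers a more convenient interface for the same task. The HTML string, the class name LinkCollector and the job titles are all hypothetical, and a real scraper would first download the page.

```python
from html.parser import HTMLParser

# Hypothetical page content; in practice this string would come from an
# HTTP response or from a website's API.
HTML = """
<html><body>
  <h1>Job offers</h1>
  <a href="/jobs/1">Data Scientist</a>
  <a href="/jobs/2">Machine Learning Engineer</a>
  <p>No link here</p>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collects (href, text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current_href = None
        self._text_parts = []

    def handle_starttag(self, tag, attrs):
        # Remember the link target when an <a> tag opens.
        if tag == "a":
            self._current_href = dict(attrs).get("href")
            self._text_parts = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a> tag.
        if self._current_href is not None:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        # Store the finished link when the <a> tag closes.
        if tag == "a" and self._current_href is not None:
            text = "".join(self._text_parts).strip()
            self.links.append((self._current_href, text))
            self._current_href = None


parser = LinkCollector()
parser.feed(HTML)
print(parser.links)
```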

Lastly, this textual data has to be transformed into numerical data, which is done with natural language processing techniques covered by Natural Language Processing with Python and Natural Language Annotation for Machine Learning. Other kinds of data are images and videos, which can be analysed using computer vision techniques: Programming Computer Vision with Python, Programming Computer Vision with Python: Tools and algorithms for analyzing images and Practical Python and OpenCV are typical resources for analysing images.
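
One simple way to turn text into numbers is a bag-of-words representation: each document becomes a vector that counts how often each known word occurs. The following is a minimal sketch using only the standard library; the two example sentences and the function names tokenize and to_vector are made up for illustration, and the NLP books above cover far more refined techniques.

```python
import re
from collections import Counter

# Hypothetical mini-corpus of two short reviews.
documents = [
    "Great movie, great cast",
    "Boring movie",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


# Vocabulary: every distinct word in the corpus, in sorted order.
vocabulary = sorted({word for doc in documents for word in tokenize(doc)})


def to_vector(text):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


vectors = [to_vector(doc) for doc in documents]
print(vocabulary)
print(vectors)
```

Each inner list in vectors has one entry per vocabulary word, so the documents can be fed to a machine learning algorithm as rows of a numerical table.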

Educational and interesting examples of what you can already do with basic Python commands and web scraping techniques:

  • Mini-Tutorial: Saving Tweets to a Database with Python
  • Web Scraping Indeed for Key Data Science Job Skills
  • Case Study: Sentiment Analysis On Movie Reviews
  • First Web Scraper
  • Sentiment Analysis of Emails
  • Simple Text Classification
  • Basic Sentiment Analysis with Python
  • Twitter sentiment analysis using Python and NLTK
  • Second Try: Sentiment Analysis in Python
  • Natural Language Processing in a Kaggle Competition for Movie Reviews

4. Machine learning with Python

Machine learning can be divided into four groups: classification, clustering, regression and dimensionality reduction.


Classification is a form of supervised learning. It helps one to classify an image in order to identify a symbol or face in it, or to classify a user from their profile in order to assign them a credit score. Clustering falls under unsupervised learning and allows the user to identify groups (clusters) within their data. Regression, also supervised, estimates a value from a set of parameters and can be used to predict the best price for a house, apartment or car. Dimensionality reduction compresses a large number of features into a few while keeping most of the information.

There is also a list of all important modules, packages and techniques for machine learning in Python, C, Scala, Java, Julia, MATLAB, Go, R and Ruby. Books about machine learning in Python:

I especially recommend the book Machine Learning in Action. Programming Collective Intelligence, although a bit short, is probably a classic in machine learning due to its age. These two books let you build machine learning algorithms from scratch.
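
As a flavour of what building an algorithm "from scratch" means, here is a minimal sketch of simple linear regression by ordinary least squares in plain Python. It is not taken from either book; the function names fit_line and predict and the size and price numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Estimate y for a new x with the fitted line."""
    return slope * x + intercept


# Hypothetical data: apartment size in square metres and price in thousands.
sizes = [30, 45, 60, 80]
prices = [95, 140, 185, 245]

slope, intercept = fit_line(sizes, prices)
estimate = predict(70, slope, intercept)
print(slope, intercept, estimate)
```

The toy prices lie exactly on a straight line, so the fit recovers that line; real data would scatter around it.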

Most recent publications about machine learning are based on the Python module scikit-learn. It makes machine learning very easy since all the algorithms are already implemented. The only thing you have to do is tell Python which ML technique should be used to analyse the data.
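
The following sketch illustrates this with one estimator for each of the four groups, all applied to the iris data set that ships with scikit-learn. The choice of estimators and parameters is only an assumption for illustration; many alternatives exist for each group.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split

# 150 flowers, 4 measurements each, 3 species as labels.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Classification: predict the species from the measurements.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

# Clustering: group the flowers without looking at the labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X)

# Regression: estimate the last measurement from the other three.
reg = LinearRegression()
reg.fit(X[:, :3], X[:, 3])
r2 = reg.score(X[:, :3], X[:, 3])

# Dimensionality reduction: compress 4 features into 2.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(accuracy, r2, X_2d.shape)
```

Every estimator follows the same pattern of creating an object and calling fit, which is why swapping one technique for another usually changes only a line or two.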

A free scikit-learn tutorial can be found on the official scikit-learn website. Other posts can be found here:

  • Introduction to Machine Learning with Python and Scikit-Learn
  • Data Science in Python
  • Machine Learning for Predicting Bad Loans
  • A Generic Architecture for Text Classification with Machine Learning
  • Using Python and AI to predict types of wine
  • Advice for applying Machine Learning
  • Predicting customer churn with scikit-learn
  • Mapping Your Music Collection
  • Case Study: Sentiment Analysis on Movie Reviews
  • Document Clustering with Python
  • Five most popular similarity measures implementation in python
  • Will it Python?
  • Text Processing in Machine Learning
  • Hacking an epic NHL goal celebration with a hue light show and real-time machine learning
  • Vancouver Room Prices
  • Exploring and Predicting University Faculty Salaries
  • Predicting Airline Delays

Books about machine learning and the module scikit-learn in Python are:

  • Collection of books on reddit
  • Building Machine Learning Systems with Python
  • Building Machine Learning Systems with Python, 2nd Edition
  • Learning scikit-learn: Machine Learning in Python
  • Machine Learning Algorithmic Perspective
  • Data Science from Scratch – First Principles with Python
  • Machine Learning in Python

Books that will be published in the coming months are:

  • Introduction to Machine Learning with Python
  • Thoughtful Machine Learning with Python: A Test-Driven Approach

Courses and blogs about Machine learning

Do you want to earn a degree, take an online course or attend a real workshop, camp or university course? Here are some links: a collection of links to online education in analytics, Big Data, data mining and data science. The Coursera course in machine learning and the Data Analyst Nanodegree from Udacity are other recommended online courses. There is also a list of frequently updated blogs about machine learning.

A great YouTube video is the class by Jake Vanderplas and Olivier Grisel, Exploring Machine Learning with Scikit-learn!

Theory of Machine Learning

Want to learn the theory of machine learning? The Elements of Statistical Learning and Introduction to Statistical Learning are often-cited classics. Other books are Introduction to Machine Learning and A Course in Machine Learning. The links lead to free PDFs, so you don’t have to pay for them! Don’t want to read? Watch 15 hours of machine learning theory!

Original article: http://lernpython.de/four-steps-to-master-machine-learning-with-python-including-free-books-resources

Translation source: http://python.jobbole.com/84326/
