Advice for applying machine learning

编程入门行业动态更新时间:2024-10-22 18:44:06

Advice for <a href=https://www.elefans.com/category/jswz/34/1743012.html style= applying machine learning"/>

Advice for applying machine learning

摘要: 本文是吴恩达 (Andrew Ng)老师《机器学习》课程，第十一章《应用机器学习的建议》中第83课时《决定下一步做什么》的视频原文字幕。为本人在视频学习过程中记录下来并加以修正，使其更加简洁，方便阅读，以便日后查阅使用。现分享给大家。如有错误，欢迎大家批评指正，在此表示诚挚地感谢！同时希望对大家的学习能有所帮助.
————————————————

By now you've seen a lot of different learning algorithms. And if you've been following along these videos, you should consider yourself an expert on many state-of-the-art machine learning techniques. But even among people that know a certain learning algorithm, there's often a huge difference between someone that really knows how to powerfully and effectively apply that algorithm, versus someone that's less familiar with some of the material that I'm about to teach and who doesn't really understand how to apply these algorithms and can end up wasting a lot of their time trying things out that don't really make sense. What I would like to do is make sure that if you are developing machine learning systems, that you know how to choose one of the most promising avenues to spend your time pursuing. And on this and the next few videos I'm going to give a number of practical suggestions, advice, guidelines on how to do that. And concretely what we'd focus on is the problem of suppose you are developing a machine learning system or trying to improve the performance of a machine learning system, how do you go about deciding what are the promising avenues to try next?

To explain this, let's continue using our example of learning to predict housing prices. And let's say you've implemented and regularized linear regression. Thus minimizing that cost function J. Now suppose that after you taking that learning parameters, if you test your hypothesis on the new set of houses, suppose you find that this is making huge errors in this prediction of the house prices. The question is what should you then try next in order to improve the learning algorithm? There are many things that one can think of that could improve the performance of the learning algorithm. One thing they could try is to get more training examples. And concretely, if you imagine, maybe, you know, setting up phone surveys, going door to door, to try to get more data on how much different houses sell for. And the sad thing is I've seen a lot of people spend a lot of time collecting more training examples, thinking if we have twice as much or ten times as much training data that is certainly going to help, right? But sometimes getting more training data doesn't actually help and in the next few videos we'll see why, and we'll see how you can avoid spending a lot of time collecting more training data in settings where it is just not going to help. Other things you might try are to well maybe try a smaller set of features. So if you have some set of features such as and so on, maybe a large number of features. Maybe you want to spend time carefully selecting some small subset of them to prevent overfitting. Or maybe you need to get additional features. Maybe the current set of features aren't informative enough and you want to collect more data in the sense of getting more features. And once again this is the sort of project that can scale up to the huge projects you can imagine getting phone surveys to find out more houses, or extra land surveys to find out more about the pieces of the land and so on, so a huge project. And once again, it would be nice to know in advance if this is going to help before we spend a lot of time doing something like this. We can also try adding polynomial features things like . We can still spend quite a lot of time thinking about that and we can also try other things like decreasing , the regularization parameter or increasing . Given a menu of options like these, some of which can easily scale up to six month or longer projects. Unfortunately, the most common method that people use to pick one of these is to go by gut feeling. In which what many people will do is sort of randomly pick one of these options and maybe say, "Oh, let's go and get more training data." And easily spend six months collecting more training data or maybe someone else would rather be saying, "well, let's go collect a lot more features on these houses in our data set." And I have a lot of time sadly seen people spend, you know, literally 6 months doing one of these avenues that they have picked sort of at random only to discover six months later that that really wasn't a promising avenue to pursue. Fortunately, there is a pretty simple technique that can let you very quickly rule out half of the things on this list as being potentially promising things to pursue. And there is a very simple technique, that if you run, can easily rule out many of these options, and potentially save you a lot of time pursuing something that's just not going to work. In the next two videos after this, I'm going to first talk about how to evaluate learning algorithms. And in the next few videos after that I'm going to talk about these techniques, which are called machine learning diagnostics.

And what a diagnostic is, is a test you can run, to get insight into what is or isn't working with an algorithm, and which will often give you insights as to what are promising things to try to improve a learning algorithm's performance. We'll talk about specific diagnostics later in this video sequence. But I should mention in advance that diagnostics can take time to implement and can sometimes, take quite a lot time to implement and understand, but doing so can be a very good use of your time when you're developing learning algorithms because they can often save you from spending many months pursuing an avenue that you could have found out much earlier just not going to be fruitful.

So in the next few videos, I'm going to talk about how to evaluate your learning algorithms. And after that, I'm going to talk about some of these diagnostics which will hopefully let you much more effectively select what are useful things to try next if your goal is to improve the performance of your machine learning system.

<end>

更多推荐

Advice for applying machine learning

本文发布于:2024-03-08 05:28:59，感谢您对本站的认可！

本文链接:https://www.elefans.com/category/jswz/34/1719943.html