The goal of my tryout is to analyze all the links on a web page and download all the images I'm interested in from those linked pages, so both CPU computation and IO are involved. It is quite easy to use WebClient's DownloadString and DownloadFile to fetch the HTML content and images of those pages. Naturally, IEnumerable&lt;T&gt; is used to define methods (like "static IEnumerable&lt;string&gt; GetImagePaths(string address)") that retrieve image paths from a given URI, so the CPU computation is mixed with the IO operations. As you can imagine, the program appears frozen while it runs, and many CPU cycles are wasted waiting for IO to complete.
static IEnumerable<string> GetImagePaths(string address)
{
    // WebClient is IDisposable, so wrap it in a using statement
    using (var client = new WebClient())
    {
        var uri = new Uri(address);
        // Blocking IO: the calling thread waits here until the download finishes
        var content = client.DownloadString(uri);
        // CPU-bound work: analyze the HTML content and extract image paths
        foreach (var path in GetJpgImagePaths(content))
        {
            yield return path;
        }
    }
}
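One way to keep the thread from blocking on the download is to make the method asynchronous. Below is a minimal sketch, assuming .NET Core 3.0+ (for IAsyncEnumerable) and the same GetJpgImagePaths helper from the snippet above; HttpClient replaces WebClient, and the await frees the thread while the IO is in flight:

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;

static async IAsyncEnumerable<string> GetImagePathsAsync(string address)
{
    using var client = new HttpClient();
    // Await the download: the calling thread is released during the IO wait
    var content = await client.GetStringAsync(new Uri(address));
    // CPU-bound HTML analysis runs only after the download has completed
    foreach (var path in GetJpgImagePaths(content))  // assumed helper, as above
    {
        yield return path;
    }
}
```

A caller would consume this with `await foreach (var path in GetImagePathsAsync(url)) { ... }`, so the enumeration stays lazy while the downloads no longer freeze the program.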
Read more: Junfeng Dai's Blog