Solving the Shakespeare Million Monkeys Problem in Real-time with Parallelism and SignalR

A monkey with a skull. Oh yes.

A little over 18 months ago I was talking to Stephen Toub (he of the Parallel Computing fame) about parallelism and the kinds of problems it could solve.

I said, naively, "could we solve the million monkeys problem?"

He said, "the what?"

"You know, if you have an infinite number of monkeys and an infinite number of keyboards they will eventually write Shakespeare."

We brainstormed some ideas (since Stephen is smarter than I, this consisted mostly of him gazing thoughtfully into the air while I sat on my hands) and eventually settled on a genetic algorithm. We would breed thousands of generations of (hypothetical) monkeys a second and then choose which ones would be allowed to perpetuate the species based solely on their ability to write Shakespeare.
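
To make the idea concrete, here is a minimal JavaScript sketch of that kind of genetic algorithm. It is not the demo's C# code. The fitness measure (summed character-code distance, where 0 means an exact match), the keep-the-top-half selection, the single-point crossover, the mutation rate, the seeded random number generator, and every function name below are assumptions chosen for illustration. Only the population size of 200 comes from the demo.

```javascript
// Illustrative sketch only: names and parameters are assumptions,
// not the demo's actual implementation.

// Small seeded PRNG (mulberry32) so that runs are repeatable.
function makeRng(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// A random printable ASCII character (codes 32 to 126).
function randomChar(rng) {
  return String.fromCharCode(32 + Math.floor(rng() * 95));
}

function randomText(length, rng) {
  let text = '';
  for (let i = 0; i < length; i++) text += randomChar(rng);
  return text;
}

// Lower is better; 0 means the monkey typed the target exactly.
function fitness(text, target) {
  let total = 0;
  for (let i = 0; i < target.length; i++) {
    total += Math.abs(text.charCodeAt(i) - target.charCodeAt(i));
  }
  return total;
}

// Single-point crossover of two parents, then an occasional one-character mutation.
function breed(mother, father, rng) {
  const cut = Math.floor(rng() * mother.length);
  let child = mother.slice(0, cut) + father.slice(cut);
  if (rng() < 0.5) {
    const i = Math.floor(rng() * child.length);
    child = child.slice(0, i) + randomChar(rng) + child.slice(i + 1);
  }
  return child;
}

function evolve(target, { populationSize = 200, maxGenerations = 5000, seed = 1 } = {}) {
  const rng = makeRng(seed);
  const byFitness = (a, b) => fitness(a, target) - fitness(b, target);
  let population = Array.from({ length: populationSize }, () => randomText(target.length, rng));

  for (let generation = 1; generation <= maxGenerations; generation++) {
    population.sort(byFitness);
    const best = population[0];
    if (fitness(best, target) === 0) return { text: best, generation };

    // Only the better half of this generation gets to have children.
    const parents = population.slice(0, Math.max(1, Math.floor(populationSize / 2)));
    const next = [best]; // carry the best monkey forward unchanged
    while (next.length < populationSize) {
      const mother = parents[Math.floor(rng() * parents.length)];
      const father = parents[Math.floor(rng() * parents.length)];
      next.push(breed(mother, father, rng));
    }
    population = next;
  }

  population.sort(byFitness);
  return { text: population[0], generation: maxGenerations };
}

const result = evolve('To be', { seed: 7 });
console.log(result.text, 'after', result.generation, 'generations');
```

Because the best monkey is always carried into the next generation, the best fitness in this sketch can never get worse from one generation to the next.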

We used the .NET 4 Task Parallel Library to make it easier for the algorithm to scale to available hardware. I mean, anyone can foreach over a million monkeys. But looping like that in parallel over 12 processors takes talent, right? Well, kinda. A lot of it is done for you by the parallelism features in .NET, and that's the point. It's Parallel Processing for the Masses.

We created a WinForms version of this application and I've used it on and off to demonstrate parallel computing on .NET. Then Paul Batum and I went to the Keeping It Realtime conference to demonstrate SignalR this last week. I didn't want to do the same "here's a real-time chat app" or "here's a map that shows its results in real-time" demos that one always does at these kinds of things. I suggested that we port our WinForms Shakespeare Monkey demo to ASP.NET and SignalR and that's what Paul proceeded to do.

Looks like 80,000 generations of monkeys

When doing something that is crazy computationally intensive but also needs to return real-time results, you might think to use node for the real-time notification part and perhaps spawn off another process and use C or something for the maths and then have them talk to each other. We like node and you can totally run node on IIS or even write node in WebMatrix. However, node is good at some things and .NET is good at some things.

For example, .NET is really good at CPU-bound computationally intensive stuff, like, I dunno, parallel matrix multiplication in F# or the like. ASP.NET is good at scaling web sites like Bing, or StackOverflow. You may not think IIS and ASP.NET when you think about real-time, but SignalR uses asynchronous handlers and smart techniques to get awesome scale when using long-polling, and it scales even more in our labs when using an efficient protocol like WebSockets with the new WebSockets support in .NET 4.5.

So, we wanted to see what happens if you combine asynchronous background work, use as many processors as you have, and get real-time status updates via SignalR over long-polling or WebSockets, all using C#, .NET 4.5, ASP.NET and IIS.

It takes about 80,000 generations of monkeys at thousands of monkey generations a second (there's 200 monkeys per generation) to get the opening line of Hamlet. So that's ~16,000,000 monkeys just to get this much text. As they say, that's a lot of monkeys.

Here's the general idea of the app. The client is pretty lightweight and quite easy. There are two boxes, two buttons and a checkbox alongside some text. There's some usual event wireup with started, cancelled, complete, and updateProgress, but see how those are on a monkeys variable? That's from $.connection.monkeys. It could be $.connection.foo, of course, as long as it's hanging off $.connection.

Those functions live on the client side, but we raise them from the server over the persistent connection and then update some text.

<script src="Scripts/jquery-1.6.4.min.js"></script>
<script src="Scripts/json2.min.js"></script>
<script src="Scripts/jquery.signalR.min.js"></script>
<script src="signalr/hubs"></script>
<script>
    $(function () {
        $('#targettext').val('To be or not to be, that is the question;\nWhether \'tis nobler in the mind to suffer\nThe slings and arrows of outrageous fortune,\nOr to take arms against a sea of troubles,\nAnd by opposing, end them.');

        // Client-side proxy for the server hub named "monkeys"
        var monkeys = $.connection.monkeys,
            currenttext = $('#currenttext'),
            generationSpan = $('#generation'),
            gpsSpan = $('#gps');

        // The functions below are raised from the server
        monkeys.updateProgress = function (text, generation, gps) {
            currenttext.val(text);
            generationSpan.text(generation);
            gpsSpan.text(gps);
        };

        monkeys.started = function (target) {
            $('#status').text('Working...');
            $('#targettext').val(target);
            $('#cancelbutton').removeAttr('disabled');
        };

        monkeys.cancelled = function () {
            $('#status').text('Cancelled');
            $('#cancelbutton').attr('disabled', 'disabled');
        };

        monkeys.complete = function () {
            $('#status').text('Done');
            $('#cancelbutton').attr('disabled', 'disabled');
        };

        // Wire up the buttons once the connection has started
        $.connection.hub.start({}, function () {
            $('#startbutton').click(function (event) {
                $('#status').text('Queued...');
                monkeys.startTyping($('#targettext').val(), $('#isparallel').is(':checked'));
            });

            $('#cancelbutton').click(function (event) {
                monkeys.stopTyping();
            });
        });
    });
</script>

The magic starts with $.connection.hub.start. The hub client-side code is actually inside ~/signalr/hubs. See how that's included at the top? That client-side proxy is generated based on what hub or hubs are on the server side.

The server side is structured like this:

[HubName("monkeys")]
public class MonkeyHub : Hub
{
    public void StartTyping(string targetText, bool parallel)
    {
    }

    public void StopTyping()
    {
    }
}

The StartTyping and StopTyping .NET methods are callable from the client side via the monkeys JavaScript object. So you can call server-side C# from the client-side JavaScript, and from the C# server you can call methods in JavaScript on the client. It'll make the most sense if you debug it and watch the traffic on the wire. The point is that C# and JSON objects can flow back and forth, which blurs the line nicely between client and server. It's all convention over configuration. That's how we talk between client and server. Now, what about those monkeys?
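
One way to picture that convention is as "call a function by name, with arguments, over the wire". The JavaScript below is a toy sketch of that idea only. The message shape and the encodeCall and dispatch helpers are invented for illustration and are not SignalR's real wire format or API.

```javascript
// Hypothetical illustration: NOT SignalR's actual protocol or API.

// "Server" side: describe a call to a named client function as JSON.
function encodeCall(method, ...args) {
  return JSON.stringify({ method, args });
}

// "Client" side: find the named function on the proxy object and invoke it.
// Returns false when the client never defined a function with that name.
function dispatch(proxy, message) {
  const { method, args } = JSON.parse(message);
  const handler = proxy[method];
  if (typeof handler !== 'function') return false;
  handler.apply(proxy, args);
  return true;
}

// The client hangs plain functions off its proxy, by convention.
const monkeysProxy = {};
monkeysProxy.updateProgress = function (text, generation, gps) {
  console.log(text, generation, gps);
};

// The "server" raises updateProgress on the client by name.
dispatch(monkeysProxy, encodeCall('updateProgress', 'To be or not', 1200, 3800));
```

Nothing is registered or configured up front in this sketch: defining a function with the expected name on the proxy is enough for the call to arrive, which is the convention-over-configuration point.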

You can check out the code in full, but StartTyping is the kickoff point. Note how it's reporting back to the Hub (calling back to the client) constantly. Paul is using Hub.GetClients to talk to all connected clients as a broadcast. This current implementation allows just one monkey job at a time. Other clients that connect will see the job in progress.

public void StartTyping(string targetText, bool parallel)
{
    var settings = new GeneticAlgorithmSettings { PopulationSize = 200 };
    var token = cancellation.Token;

    currentTask = currentTask.ContinueWith((previous) =>
    {
        // Create the new genetic algorithm
        var ga = new TextMatchGeneticAlgorithm(parallel, targetText, settings);
        TextMatchGenome? bestGenome = null;
        DateTime startedAt = DateTime.Now;

        Hub.GetClients<MonkeyHub>().started(targetText);

        // Iterate until a solution is found or until cancellation is requested
        for (int generation = 1; ; generation++)
        {
            if (token.IsCancellationRequested)
            {
                Hub.GetClients<MonkeyHub>().cancelled();
                break;
            }

            // Move to the next generation
            ga.MoveNext();

            // If we've found the best solution thus far, update the UI
            if (bestGenome == null ||
                ga.CurrentBest.Fitness < bestGenome.Value.Fitness)
            {
                bestGenome = ga.CurrentBest;

                int generationsPerSecond = generation / Math.Max(1, (int)((DateTime.Now - startedAt).TotalSeconds));
                Hub.GetClients<MonkeyHub>().updateProgress(bestGenome.Value.Text, generation, generationsPerSecond);

                if (bestGenome.Value.Fitness == 0)
                {
                    Hub.GetClients<MonkeyHub>().complete();
                    break;
                }
            }
        }
    }, TaskContinuationOptions.OnlyOnRanToCompletion);
}
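
The control flow of that loop is easy to lose among the hub calls, so here is a stripped-down JavaScript sketch of just the pattern: check a cancellation flag every generation, and only notify clients when the best result improves. The names createCancellationSource, runGenerations, step and clients are hypothetical stand-ins, and the scripted fitness values are made up for the example.

```javascript
// Hypothetical names throughout: a sketch of the loop's control flow only.

// A tiny stand-in for a cancellation token: a shared flag plus a way to set it.
function createCancellationSource() {
  const token = { isCancellationRequested: false };
  return { token, cancel() { token.isCancellationRequested = true; } };
}

// step(generation) returns the current best candidate as { text, fitness }.
// clients exposes started, cancelled, updateProgress and complete callbacks.
function runGenerations(step, token, clients) {
  let best = null;
  clients.started();

  for (let generation = 1; ; generation++) {
    // Cooperative cancellation: the loop itself decides when to stop.
    if (token.isCancellationRequested) {
      clients.cancelled();
      return;
    }

    const candidate = step(generation);

    // Report only when this generation beats everything seen so far.
    if (best === null || candidate.fitness < best.fitness) {
      best = candidate;
      clients.updateProgress(best.text, generation);

      if (best.fitness === 0) {
        clients.complete();
        return;
      }
    }
  }
}

// Usage with a scripted sequence of fitness values instead of real monkeys.
const scriptedFitness = [5, 5, 3, 4, 0];
const events = [];
runGenerations(
  (generation) => ({ text: 'gen' + generation, fitness: scriptedFitness[generation - 1] }),
  createCancellationSource().token,
  {
    started: () => events.push('started'),
    cancelled: () => events.push('cancelled'),
    updateProgress: (text, generation) => events.push('progress:' + generation),
    complete: () => events.push('complete'),
  }
);
console.log(events.join(', '));
```

Reporting only on improvement keeps the chatter to clients low even when the loop is running thousands of generations a second.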

If he wanted, he could use this.Caller to communicate with the specific client that called StartTyping. Inside ga.MoveNext we make the decision to go parallel or not based on that checkbox. This is where we pick two random high quality parent monkeys from our population for a potential future monkey. Hopefully one whose typing looks more like Shakespeare and less like a Regular Expression.

By simply changing from Enumerable.Range to ParallelEnumerable.Range we can start taking easily parallelizable things and using all the processors on our machine. Note the code is the same otherwise.

private TextMatchGenome[] CreateNextGeneration()
{
    var maxFitness = _currentPopulation.Max(g => g.Fitness) + 1;
    var sumOfMaxMinusFitness = _currentPopulation.Sum(g => (long)(maxFitness - g.Fitness));

    if (_runParallel)
    {
        return (from i in ParallelEnumerable.Range(0, _settings.PopulationSize / 2)
                from child in CreateChildren(
                    FindRandomHighQualityParent(sumOfMaxMinusFitness, maxFitness),
                    FindRandomHighQualityParent(sumOfMaxMinusFitness, maxFitness))
                select child).
                ToArray();
    }
    else
    {
        return (from i in Enumerable.Range(0, _settings.PopulationSize / 2)
                from child in CreateChildren(
                    FindRandomHighQualityParent(sumOfMaxMinusFitness, maxFitness),
                    FindRandomHighQualityParent(sumOfMaxMinusFitness, maxFitness))
                select child).
                ToArray();
    }
}
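
The body of FindRandomHighQualityParent isn't shown above, but its two arguments hint at fitness-proportionate ("roulette wheel") selection, where each monkey's chance of being picked is proportional to maxFitness minus its own fitness. The JavaScript below is a sketch under that assumption, not the demo's actual code. The selectionWeights helper and the injectable random parameter are additions made for the example.

```javascript
// Assumed behaviour: the real FindRandomHighQualityParent is not shown in the post.

// Mirrors the two values computed at the top of CreateNextGeneration.
function selectionWeights(population) {
  const maxFitness = Math.max(...population.map((g) => g.fitness)) + 1;
  const sumOfMaxMinusFitness = population.reduce((sum, g) => sum + (maxFitness - g.fitness), 0);
  return { maxFitness, sumOfMaxMinusFitness };
}

// Spin the wheel: lower fitness (closer to the target) means a bigger slice.
// The "+ 1" in maxFitness gives even the worst monkey a slice of size 1.
function findRandomHighQualityParent(population, sumOfMaxMinusFitness, maxFitness, random = Math.random) {
  let remaining = random() * sumOfMaxMinusFitness;
  for (const genome of population) {
    remaining -= maxFitness - genome.fitness;
    if (remaining < 0) return genome;
  }
  return population[population.length - 1]; // guard against floating-point rounding
}

// Usage: slices here are 31, 21 and 1 out of a total of 53.
const population = [
  { text: 'To be', fitness: 0 },
  { text: 'To bo', fitness: 10 },
  { text: 'Zo bz', fitness: 30 },
];
const { maxFitness, sumOfMaxMinusFitness } = selectionWeights(population);
const parent = findRandomHighQualityParent(population, sumOfMaxMinusFitness, maxFitness);
console.log(parent.text);
```

Each pick only reads the shared population, which fits with the claim that the same query can run over Enumerable.Range or ParallelEnumerable.Range unchanged. A parallel version would still need a random source that is safe to use from several threads.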

My 12 proc desktop does about 3800 generations a second in parallel.
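
For a rough sense of scale, the numbers quoted in this post can be combined in a few lines of JavaScript. This is back-of-the-envelope arithmetic only: it assumes the 3800 generations a second rate holds for the entire run and that the 80,000 generation figure applies to the parallel case, neither of which was measured here.

```javascript
// Figures quoted in the post; the steady-rate assumption is ours.
const generations = 80000;          // generations needed for the opening line of Hamlet
const monkeysPerGeneration = 200;   // population size
const generationsPerSecond = 3800;  // parallel rate on the 12-processor desktop

const totalMonkeys = generations * monkeysPerGeneration;  // 16,000,000 monkeys
const estimatedSeconds = generations / generationsPerSecond;

console.log(totalMonkeys.toLocaleString('en-US') + ' monkeys');
console.log('roughly ' + Math.round(estimatedSeconds) + ' seconds');
```

Under those assumptions the whole run works out to roughly 21 seconds.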

Big thanks to Paul for the lovely port of this to SignalR and to Stephen Toub for the algorithm. 

The code for the SignalR monkeys demo is on my BitBucket. Right now it needs .NET 4.5 and the Visual Studio Developer Preview, but you could remove a few lines and get it working on .NET 4, no problem.

Note that SignalR works on .NET 4 and up and you can play with it today. You can even chat with the developers in the SignalR chat app in the 'aspnet' room at http://chatapp.apphb.com. Just /nick yourself then /join aspnet.

No monkeys were hurt in the writing of this blog post.

Translated from: https://www.hanselman.com/blog/solving-the-shakespeare-million-monkeys-problem-in-realtime-with-parallelism-and-signalr
