<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-6614598184406322281</id><updated>2012-02-16T11:57:09.012-08:00</updated><category term='qemu-img'/><category term='High Performance Computing'/><category term='HADOOP Streaming'/><category term='graphics processing unit'/><category term='parallel computing'/><category term='internal cloud'/><category term='Simple Storage Systems'/><category term='processor chips'/><category term='Chromium OS'/><category term='solar power notebooks'/><category term='mobile phones'/><category term='software raid'/><category term='Generalized Extreme Value Distribution'/><category term='rdbms'/><category term='lvm'/><category term='cluster file system'/><category term='open source'/><category term='Apple'/><category term='big data'/><category term='power usage'/><category term='lustre'/><category term='cell phones'/><category term='Black Swan'/><category term='quad core'/><category term='Distributed Resource Manager'/><category term='nvida'/><category term='map reduce'/><category term='linear algebra'/><category term='jeopardy'/><category term='natural language processing'/><category term='Dell'/><category term='DRM'/><category term='cell processors'/><category term='mdadm'/><category term='Virtual Machines'/><category term='Africa'/><category term='Apache'/><category term='DNA genomics HPC hadoop'/><category term='EC2'/><category term='job scheduling'/><category term='Microsft'/><category term='monte carlo'/><category term='MPI'/><category term='volatility'/><category term='IBM'/><category term='facebook'/><category term='cooling'/><category term='4G'/><category term='simulation'/><category term='port 
trunking'/><category term='VDOOP'/><category term='incubator'/><category term='java'/><category term='wifi'/><category term='Torque'/><category term='vmware'/><category term='Amazon Elastic Compute Cloud'/><category term='netbooks'/><category term='Portable Batch System'/><category term='Stern School'/><category term='HPC'/><category term='Citrix'/><category term='venture capital'/><category term='hdfs'/><category term='Educause'/><category term='dragon naturally speaking'/><category term='mdadm resync'/><category term='XEN'/><category term='gpu'/><category term='NFS'/><category term='social networks'/><category term='military applications'/><category term='less developed countries'/><category term='virtual servers'/><category term='hardware raid'/><category term='hadoop performance'/><category term='hosts.deny'/><category term='bandwidth'/><category term='cluster computing'/><category term='Accelereyes'/><category term='private cloud'/><category term='iscsi'/><category term='grid computing'/><category term='HADOOP'/><category term='udev'/><category term='ubuntu'/><category term='linear programming'/><category term='assignment problem'/><category term='sas'/><category term='raid array'/><category term='heavy-tails'/><category term='Jacket'/><category term='Intel'/><category term='fft'/><category term='raid 6'/><category term='sandbox'/><category term='Wireless'/><category term='random number generation'/><category term='&quot;live migration&quot;'/><category term='search engines'/><category term='Open VPN'/><category term='Floating Point'/><category term='computing in higher education'/><category term='Sony PS3'/><category term='research computing'/><category term='ipad'/><category term='equallogic'/><category term='sata disk'/><category term='Splus'/><category term='64 bit linux'/><category term='snapshot'/><category term='grid processing'/><category term='3G'/><category term='Oracle Grid Engine'/><category term='gaussian'/><category term='yum upgrade'/><category 
term='sata drives'/><category term='haoop'/><category term='misting'/><category term='Nvidia'/><category term='bing'/><category term='disk storage'/><category term='Sun Computer'/><category term='business analytics'/><category term='firewall'/><category term='elastic compute cloud'/><category term='Virtualization'/><category term='kvm'/><category term='sofware raid'/><category term='Android'/><category term='Yahoo'/><category term='Windows Applications'/><category term='Watson'/><category term='tesla'/><category term='Centos'/><category term='san'/><category term='linux'/><category term='GEV'/><category term='Cuda'/><category term='PBS'/><category term='rackspace'/><category term='fermi hpc'/><category term='statistical computing'/><category term='cloud computing'/><category term='Storage arrays'/><category term='Amazon Elastic Cloud'/><category term='server cloning'/><category term='Microsoft Yahoo merger'/><category term='pnfs'/><category term='openfiler'/><category term='startup'/><category term='Super Computers'/><category term='ssh'/><category term='Sun Grid Engine'/><category term='Blue Cloud'/><category term='backups'/><category term='ओएप्शन'/><category term='S3'/><category term='hadoop nosql DB2 Oracle'/><category term='distributed computing'/><category term='hackers'/><category term='Google'/><category term='live migration'/><category term='fat-tails'/><category term='medical diagnosis'/><category term='matlab'/><category term='server farms'/><category term='options data'/><category term='business school'/><category term='remote computing'/><category term='microsoft'/><category term='qemu'/><category term='quotes'/><category term='amd'/><category term='iptables'/><category term='Nic bonding'/><title type='text'>Research Computing at Stern/NYU</title><subtitle type='html'>This blog will be used to share information about research computing in business schools. Hopefully other researchers around the world will provide some feedback.
I hope it is helpful. &lt;a href="http://www.stern.nyu.edu/~nwhite"&gt;Norman White&lt;/a&gt;, Faculty Director, &lt;a href="http://www.stern.nyu.edu/scrc"&gt;Stern Center for Research Computing &lt;/a&gt;</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>78</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-5974790479393697495</id><published>2011-12-05T13:53:00.000-08:00</published><updated>2011-12-05T13:53:37.731-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='hadoop nosql DB2 Oracle'/><title type='text'>More Hadoop Users - Is it hype?</title><content type='html'>&lt;a href="http://www.informationweek.com/news/development/database/231902466?pgno=1"&gt;Information Week article on Big Data and Hadoop&lt;/a&gt; is yet another article that expands the list of companies and industries that are either using or experimenting with Hadoop and other NoSQL (Not Only SQL) solutions to big data problems.&lt;br /&gt;&lt;br /&gt;There is now a whole industry building around Hadoop and other similar systems. 
This looks like the adoption of SQL and relational databases back in the 1970s. IBM developed System R (starting around 1974), which later led to DB2. Oracle built its product on the SQL syntax and architecture described in IBM's System R papers, and soon we had Ingres, Informix, Sybase, ... Much later Microsoft joined with Access and SQL Server.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-5974790479393697495?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/5974790479393697495/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=5974790479393697495' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/5974790479393697495'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/5974790479393697495'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2011/12/more-hadoop-users-is-it-hype.html' title='More Hadoop Users - Is it hype?'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-9064611633344271774</id><published>2011-12-01T13:10:00.000-08:00</published><updated>2011-12-01T13:10:35.773-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='DNA genomics HPC hadoop'/><title type='text'>Big Data and Genomics</title><content type='html'>An &lt;a 
href="http://www.nytimes.com/2011/12/01/business/dna-sequencing-caught-in-deluge-of-data.html?_r=1&amp;adxnnl=1&amp;ref=technology&amp;adxnnlx=1322773355-UxXvMJC8zG4grvmnpdtr2w"&gt;article&lt;/a&gt; in today's New York Times maintains that the world is now sequencing genomes several times faster than they can be analyzed. First, there seems to be an error in the article, since it claims that a federal online archive now has 700 trillion DNA bases taking up 300 trillion bytes of storage. Wow, that works out to only about 3.4 bits per base (300 × 8 / 700), barely above the 2 bits needed to distinguish four bases. My guess is that the storage is more like 300 trillion megabytes.&lt;br /&gt;&lt;br /&gt;In any event, I think it is possible that the hardware they are using for sequencing can also be used for the analysis. If the hardware is a more or less standard HPC configuration, then it can also be used to run Hadoop or some other "big data" architecture for the analysis. This would eliminate having to move the sequenced data to another location for analysis.&lt;br /&gt;&lt;br /&gt;Just a thought.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-9064611633344271774?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/9064611633344271774/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=9064611633344271774' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/9064611633344271774'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/9064611633344271774'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2011/12/big-data-and-genomics.html' title='Big Data and 
Genomics'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-2591388745385886057</id><published>2011-10-28T09:48:00.000-07:00</published><updated>2011-11-01T07:08:16.961-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='IBM'/><category scheme='http://www.blogger.com/atom/ns#' term='business analytics'/><category scheme='http://www.blogger.com/atom/ns#' term='Virtualization'/><category scheme='http://www.blogger.com/atom/ns#' term='haoop'/><category scheme='http://www.blogger.com/atom/ns#' term='Watson'/><category scheme='http://www.blogger.com/atom/ns#' term='big data'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' term='microsoft'/><category scheme='http://www.blogger.com/atom/ns#' term='Oracle Grid Engine'/><title type='text'>Big Data and Business Analytics  comes of age</title><content type='html'>The world of enterprise computing is now starting to incorporate new technologies that were developed and adopted by new media companies like Google, Yahoo, Facebook, LinkedIn et al. These technologies include large-scale distributed systems like Hadoop, which can handle enormously large data sets, as well as cloud computing and virtualization, HTML5, smart mobile devices and many more.  We are now on the cusp of yet another revolution, one that will force many industries to change how they process their data and what products and services they can offer. 
We are already seeing this in the new media world, where the dramatic rise in the generation and use of social media data and its obvious value is rapidly changing (once again) how products are marketed and purchased. &lt;br /&gt;&lt;br /&gt;The need to analyze the data generated by social media companies like Facebook, LinkedIn and the thousands of other related companies has forced companies to adopt new approaches to gathering, storing and analyzing the data.  In addition, successful companies have to be able to scale their systems up and out faster than they can possibly manage on in-house equipment. This in turn has led companies like Rackspace, Amazon and Google to develop architectures that allow the rapid deployment of very large amounts of computing power, essentially allowing companies to outsource all or part of their infrastructure so that they have on-demand computing.  These changes will have a dramatic impact on both industry and society.&lt;br /&gt;&lt;br /&gt;The use of "Business Analytics" will spread rapidly as the knowledge of how to handle and analyze the huge amounts of data now available becomes mainstream. We have been seeing these changes over the last ten years, as first Google and then other companies developed new approaches to handling and analyzing data. These approaches typically involve thousands or tens of thousands of computers that can be harnessed together to attack seemingly unsolvable problems. Google, Yahoo and Facebook have datacenters of sizes that would have been beyond comprehension a decade ago.  The development of these immense computation centers has forced the computer industry to adapt and develop processors that are much "greener" than the previous generation, i.e. they use much less power and generate much less heat.  In addition, these data centers are now being built near sources of power and cooling (i.e. rivers and large bodies of water) so that their energy costs are reduced. 
(Try to imagine Google's electric bill).  The efficiency of these new monstrously large data centers means that we are seeing a move back to more centralized computing, where many companies may find it less expensive to outsource some of their computing to a cloud computing provider. This in turn raises a host of security, reliability, contractual and network related issues.    &lt;br /&gt;&lt;br /&gt;However, Big Data and Business Analytics technologies have now come of age. Within the last few months, both &lt;a href="http://www.informationweek.com/news/software/info_management/231901480"&gt;Oracle&lt;/a&gt; and &lt;a href="http://www.emc.com/about/news/press/2011/20110509-03.htm"&gt;EMC&lt;/a&gt; have announced NoSQL solutions for unstructured data. This week, &lt;a href="http://www.computerworld.com/s/article/9220779/Microsoft_climbs_onto_Hadoop_bandwagon"&gt;Microsoft&lt;/a&gt; joined in by announcing that SQL Server 2012 will act as a front end to Hadoop. IBM has been supporting Hadoop for years &lt;a href="http://www-01.ibm.com/software/data/bigdata/enterprise.html"&gt;(SEE)&lt;/a&gt;, and has incorporated it into its Business Analytics practice. It also used Hadoop as the back end for its &lt;a href="http://developer.yahoo.com/blogs/hadoop/posts/2011/02/i%E2%80%99ll-take-hadoop-for-400-alex/"&gt;Watson&lt;/a&gt; system that won at Jeopardy a few months ago.&lt;br /&gt;&lt;br /&gt;But the recent announcements by EMC, Oracle and Microsoft will bring these technologies directly to the corporate environment and to many CIOs who have never heard of them.&lt;br /&gt;&lt;br /&gt;We are entering a new era. It will be interesting to see how fast these technologies are adopted by major corporations. Some companies, like Oracle and IBM, would seem to have an advantage. 
For instance, Oracle controls Java (Hadoop is written in Java) and Lustre (see previous post), and has the ability to sell complete solutions including hardware, software and consulting. IBM can also offer complete solutions.  &lt;br /&gt;&lt;br /&gt;The next announcement I expect to hear is that the major consulting companies are launching practices centered around "big data". This could generate lots of business for them.&lt;br /&gt;&lt;br /&gt;Stay tuned...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-2591388745385886057?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/2591388745385886057/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=2591388745385886057' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2591388745385886057'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2591388745385886057'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2011/10/big-data-and-business-analytics-comes.html' title='Big Data and Business Analytics  comes of age'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-7015960615498277211</id><published>2011-07-20T07:35:00.000-07:00</published><updated>2011-07-20T17:20:11.828-07:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='heavy-tails'/><category scheme='http://www.blogger.com/atom/ns#' term='GEV'/><category scheme='http://www.blogger.com/atom/ns#' term='fat-tails'/><category scheme='http://www.blogger.com/atom/ns#' term='Black Swan'/><category scheme='http://www.blogger.com/atom/ns#' term='gaussian'/><category scheme='http://www.blogger.com/atom/ns#' term='Generalized Extreme Value Distribution'/><title type='text'>Generalized Extreme Value Distribution and market prices</title><content type='html'>A recent &lt;a href="http://behind-the-enemy-lines.blogspot.com/"&gt;post&lt;/a&gt; by a colleague of mine, Panos Ipeirotis, on a Newsweek article about the cost of Mechanical Turk workers triggered some thinking about WHY many prices are often "heavy-tailed" and don't conform to a normal distribution. Panos pointed out errors in the methodology of the Newsweek study, since they didn't adjust for the size of turker populations in different countries. He pointed out that the prices that were being observed in their study would come from a Generalized Extreme Value Distribution, since they were the minimum bid in each country.&lt;br /&gt;&lt;br /&gt;I won't go into GEV in detail, but the basic premise is that if you repeatedly draw an IID sample from a given distribution and look at the extreme values (maxima or minima), these will, suitably normalized, converge to a GEV distribution as the sample size grows. GEV is a family of distributions. These results are given by the Fisher–Tippett–Gnedenko theorem. The theorem shows that there are only three possible limiting families (Gumbel, Fréchet or Weibull), depending on the tail behavior of the underlying distribution. 
Panos' blog has a much more complete description.&lt;br /&gt;&lt;br /&gt;I don't want to go into the details here, but rather argue that we should expect most prices that are determined by market forces to be "heavy-tailed", especially whenever there is a market disturbance, or in the case of the Newsweek article a monopolistic or monopsonistic market.&lt;br /&gt;&lt;br /&gt;The argument (which I am sure can't be new, but is new to me) is that in many markets, especially the stock market, there are a number of bids and asks at different prices and sizes for any given asset at any point in time. A trade will be executed at the maximum of the bids or the minimum of the asks, for obvious reasons. In a market in equilibrium, with a similar number of buyers and sellers, the prices will hover around the expected value of the buyers and sellers, and hence (appear to be) distributed normally. (The mode of the Gumbel distribution is the same as the mode of the normal from which the maxima or minima are drawn.)&lt;br /&gt;&lt;br /&gt;However, when there is an imbalance of buyers and sellers, suddenly one side of the trade is being executed at prices in the tail of the distribution (i.e. the max bid or min ask price). These prices are being drawn from a GEV distribution (Gumbel if the underlying distribution is normal or log-normal).&lt;br /&gt;&lt;br /&gt;In these situations, volatility goes up (variance of the GEV &gt; underlying ??) and we can have rapid market movements (flash crashes etc.) that are multiple standard deviations away from the average price, i.e. 6 sigma events, black swans etc.&lt;br /&gt;&lt;br /&gt;That these events occur is not disputed. Wall Street practitioners have been using heavy-tailed distributions for years to better model price movements. 
What I haven't seen is an argument as to why we would expect this to happen.&lt;br /&gt;&lt;br /&gt;In fact, Nassim Taleb, in his book "The Black Swan", gives countless examples of where the Gaussian model breaks down, and describes his "barbell" investment practice of buying way out of the money options, because of the inherent mispricing of the Black-Scholes model (which assumes log-normally distributed prices). When crashes like 2008 come along, these investment approaches make lots of money.&lt;br /&gt;&lt;br /&gt;What I haven't seen (and I haven't really looked) is a discussion of WHY probability and statistical theory would tell us to expect these heavy-tailed distributions. &lt;br /&gt;&lt;br /&gt;I now expect to see them in almost any type of market.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-7015960615498277211?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/7015960615498277211/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=7015960615498277211' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7015960615498277211'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7015960615498277211'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2011/07/generalized-extreme-value-distribution.html' title='Generalized Extreme Value Distribution and market prices'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-4386430198168008553</id><published>2011-06-29T12:49:00.000-07:00</published><updated>2011-06-29T12:49:27.120-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='HADOOP'/><category scheme='http://www.blogger.com/atom/ns#' term='hdfs'/><category scheme='http://www.blogger.com/atom/ns#' term='lustre'/><category scheme='http://www.blogger.com/atom/ns#' term='san'/><title type='text'>Hadoop and SANs - Not just Lustre</title><content type='html'>As I thought about my last post, it became obvious that Lustre is just a special case of any kind of SAN-like solution that provides a large, flat file system to a number of hosts with very high performance. Any SAN-like solution could be used to hold the underlying data. The major change to Hadoop would be that there would have to be a cluster-wide global file system controlled by the Hadoop namenode, which would give individual slave nodes the current list of the blocks they have access to.&lt;br /&gt;&lt;br /&gt;Again, the result is dramatically reduced data movement. Current SANs deliver hundreds of thousands of IOPS and throughput of a gigabyte per second to individual servers. Think of a fast SAN with multiple 10 Gb Ethernet NICs. If a cluster got too large for one SAN, one could have multiple SANs and indicate they are in different racks, so Hadoop could keep replicas on another SAN.&lt;br /&gt;&lt;br /&gt;Many interesting possibilities. 
Need to look at some code.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-4386430198168008553?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/4386430198168008553/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=4386430198168008553' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/4386430198168008553'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/4386430198168008553'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2011/06/hadoop-and-sans-not-just-lustre.html' title='Hadoop and SANs - Not just Lustre'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-8949514625195528127</id><published>2011-06-24T11:51:00.000-07:00</published><updated>2011-06-24T11:51:02.868-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='HADOOP'/><category scheme='http://www.blogger.com/atom/ns#' term='lustre'/><category scheme='http://www.blogger.com/atom/ns#' term='fermi hpc'/><category scheme='http://www.blogger.com/atom/ns#' term='hadoop performance'/><category scheme='http://www.blogger.com/atom/ns#' term='Oracle Grid Engine'/><title type='text'>Hadoop and Lustre - Some Thoughts</title><content type='html'>Hadoop has become  a major technology for new media companies 
that have to handle large textual data. Large Hadoop clusters can have thousands of nodes, making them seem similar to a large HPC cluster. The difference is that in Hadoop clusters, the disk is local to each node, so that each node can process the data locally on that node. In HPC clusters, typically each node has limited disk space and most of the space is on a distributed, high performance file system like Lustre.&lt;br /&gt;&lt;br /&gt;Storing HDFS data locally means that Hadoop is less sensitive to network speeds, and usually runs well on a 1 Gb network. But what happens if you run Hadoop and put all of its data into a distributed file system like Lustre? There have been several recent tests in doing that, essentially turning all or part of an HPC cluster into a Hadoop cluster. Oracle Grid Engine now offers Hadoop scheduling as part of its product. The HOD (Hadoop on Demand) project adds Hadoop support to the Torque grid scheduling system. Initial results of this approach were a little discouraging, since Hadoop on local disks seemed to outperform Hadoop on a Lustre file system, although the latter still performed pretty well. However, early tests were not on production HPC clusters, but were on small test clusters running a 1 Gb Ethernet backbone for the Lustre file system. This basically limited the Lustre performance to be about the same as an 80 MB/sec SATA disk.&lt;br /&gt;&lt;br /&gt;More recently &lt;a href="http://www.olcf.ornl.gov/wp-content/events/lug2011/4-12-2011/1100-1130_Nathan_Rutman_MapReduce_Lug_2011.pptx"&gt;tests&lt;/a&gt; have been done on an InfiniBand network, which delivers much higher bandwidth. In this environment, Hadoop on Lustre starts to really perform, sometimes as much as 3 to 1 over local disks. That is, 
the cluster only has to be 1/3 the size if it is running on an InfiniBand or other low latency, high bandwidth network fabric.&lt;br /&gt;&lt;br /&gt;But there may be even more performance to be squeezed out of a Hadoop/Lustre configuration. In a Lustre file system, basically all data can be accessed by any node at the same high speed, typically at least 1 GB/sec. Just putting Hadoop on top of Lustre ignores the fact that Hadoop is still going to be MOVING lots of data from one node to another, albeit at a very high speed.&lt;br /&gt;&lt;br /&gt;This seems pretty wasteful, since the data is just being moved from one location on the Lustre file system to another. A much better solution is to have the Hadoop namenode just update the file tables of the data nodes, telling them what data is now "local" to them. In this way, the data movement would not actually occur; only the pointers to the data would change. A &lt;a href="http://arch.eece.maine.edu/superme/images/4/4b/Dunnmid.pdf"&gt;study&lt;/a&gt; by a student at the University of Maine hints at this approach but didn't actually try it. A &lt;a href="http://wiki.lustre.org/images/1/1b/Hadoop_wp_v0.4.2.pdf"&gt;paper&lt;/a&gt; by interns at Sun actually tried it, but only on a 1 Gb network with a very small (2 OST) Lustre system.&lt;br /&gt;&lt;br /&gt;This approach should eliminate the Hadoop/Lustre performance bottlenecks on a 1 Gb network, since data movement would be dramatically reduced. 
In fact, in the Sun paper, even a small Lustre file system performed well on a 1gb network.&lt;br /&gt;&lt;br /&gt;Food for thought...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-8949514625195528127?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/8949514625195528127/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=8949514625195528127' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8949514625195528127'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8949514625195528127'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2011/06/hadoop-and-lustre-some-thoughts.html' title='Hadoop and Lustre - Some Thoughts'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-7727389521216858330</id><published>2011-04-25T07:37:00.000-07:00</published><updated>2011-04-25T07:37:23.956-07:00</updated><title type='text'>Amazon cloud outage</title><content type='html'>It is hard not to comment on the Amazon EC2 cloud outage last week, especially since so many people were affected. &lt;br /&gt;First, those sites that were knocked out had been gambling that the site they were  hosted on would never go down.  Not a good bet. 
They were almost all startups or relatively low budget companies that probably hadn't quite gotten around to solving redundancy issues. &lt;br /&gt;&lt;br /&gt;Amazon has facilities that will allow your site to be hosted in multiple data centers around the world. These companies had not taken advantage of that, apparently not even as cold backup sites. You get what you pay for.&lt;br /&gt;&lt;br /&gt;However, even customers who had their systems in multiple availability zones got hammered, since it was apparently the Amazon backup procedures which had the problem, and ran multiple data centers out of disk space.&lt;br /&gt;&lt;br /&gt;Second, WHAT HAPPENED? Amazon has been very slow in getting news out as to what happened, when it was going to be fixed, and what they will do to prevent this happening again.&lt;br /&gt;It certainly seems like more than a hardware failure. It feels more like a major software glitch, a hacker attack or human error (the rm * command issued in the wrong directory).&lt;br /&gt;Hopefully, Amazon will eventually tell us what happened.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-7727389521216858330?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/7727389521216858330/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=7727389521216858330' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7727389521216858330'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7727389521216858330'/><link rel='alternate' type='text/html' 
href='http://researchcomputing.blogspot.com/2011/04/amazon-cloud-outage.html' title='Amazon cloud outage'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-2041425445877545946</id><published>2011-04-18T07:50:00.000-07:00</published><updated>2011-04-19T12:59:46.757-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='misting'/><category scheme='http://www.blogger.com/atom/ns#' term='HADOOP'/><category scheme='http://www.blogger.com/atom/ns#' term='facebook'/><category scheme='http://www.blogger.com/atom/ns#' term='cooling'/><category scheme='http://www.blogger.com/atom/ns#' term='fermi hpc'/><category scheme='http://www.blogger.com/atom/ns#' term='big data'/><category scheme='http://www.blogger.com/atom/ns#' term='Google'/><category scheme='http://www.blogger.com/atom/ns#' term='Apache'/><category scheme='http://www.blogger.com/atom/ns#' term='solar power notebooks'/><category scheme='http://www.blogger.com/atom/ns#' term='rackspace'/><category scheme='http://www.blogger.com/atom/ns#' term='power usage'/><title type='text'>Facebook joins Google in HPC Computing Architectures for Big Data</title><content type='html'>Facebook recently revealed how it uses custom designed servers by rackspace to build its data centers. Google was the first search company to develop an architecture and its own software systems for handling the huge amount of data generated by indexing the web. Their approach has become a standard in the Big Data world where data has to be always available anywhere in the world. Most data is write once and read in many places, and doesn't really fit into the standard relational data base methodologies. 
&lt;br /&gt;&lt;br /&gt;Instead, data is distributed and replicated across many thousands of servers which are in clusters around the world. Google pioneered this approach when they had to find a way to implement their page rank algorithm. The Google file system and their map reduce processing approach fit nicely into this environment. More recently, Yahoo has supported an open source version of the Google approach. This system is called Hadoop, and has recently been handed off to Apache as a top level Apache project. It is now the standard at many, many search and social media companies, such as Yahoo, Facebook, Media 6, Digg, ... There are a variety of toolsets built on top of Hadoop, just as Google has a wide variety of systems built on top of its proprietary system.&lt;br /&gt;&lt;br /&gt;One of the challenges of the Big Data world of processing is the power consumption of those thousands of servers in each data center. The Facebook team has now unveiled its hardware and networking architecture, which takes advantage of newer processors that have sophisticated power management capabilities, such that they can dynamically use less power when they are not heavily loaded. Custom designed 1.5U (instead of 1U) servers have extra large fans that cool more efficiently. They also have larger heat sinks.&lt;br /&gt;&lt;br /&gt;The whole data center has been designed for low power consumption, including cooling and power backup systems. 
The new data center design uses misting systems at the top level to cool air (if necessary) and lets the cool air drop down over the servers instead of being forced up by fans.&lt;br /&gt;Cool idea!!&lt;br /&gt;&lt;br /&gt;Here is a graphic of their data center design.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.datacenterknowledge.com/wp-content/uploads/2011/04/opencomputer-datacenter-950.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="830" width="800" src="http://www.datacenterknowledge.com/wp-content/uploads/2011/04/opencomputer-datacenter-950.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;This is another example of where the problems of the "new media" sites are forcing out of the box thinking that is quite relevant to general and HPC computing. &lt;br /&gt;&lt;br /&gt;I look forward to more innovations like this from new media companies, that will eventually change how major corporations do their processing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-2041425445877545946?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/2041425445877545946/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=2041425445877545946' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2041425445877545946'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2041425445877545946'/><link rel='alternate' type='text/html' 
href='http://researchcomputing.blogspot.com/2011/04/facebook-joins-google-in-hpc-computing.html' title='Facebook joins Google in HPC Computing Architectures for Big Data'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-8603711857150481468</id><published>2011-03-01T15:22:00.001-08:00</published><updated>2011-03-01T15:22:27.269-08:00</updated><title type='text'>You can now follow the blog on twitter</title><content type='html'>I think I have set up my blog, so that if you follow @nhwhite212 on twitter, you will get status updates on my new posts. In any event it should work soon.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-8603711857150481468?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/8603711857150481468/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=8603711857150481468' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8603711857150481468'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8603711857150481468'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2011/03/you-can-now-follow-blog-on-twitter.html' title='You can now follow the blog on twitter'/><author><name>Norman 
White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-6407057013942307621</id><published>2011-02-19T13:59:00.000-08:00</published><updated>2011-02-19T13:59:07.073-08:00</updated><title type='text'>New hadoop coming - Yahoo drops hadoop support</title><content type='html'>http://developer.yahoo.com/blogs/hadoop/posts/2011/02/mapreduce-nextgen/&lt;br /&gt;&lt;br /&gt;Yahoo is moving the whole hadoop project to be part of Apache.&lt;br /&gt;&lt;br /&gt;In addition, there is a new version of hadoop under development which will eliminate some of the problems in the current framework. One of the annoying things in using hadoop is that all updates come out as a new complete system which means the whole cluster has to be brought down and the new software installed. 
The new framework will allow upgrades to be done while the cluster is still processing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-6407057013942307621?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/6407057013942307621/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=6407057013942307621' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/6407057013942307621'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/6407057013942307621'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2011/02/new-hadoop-coming-yahoo-drops-hadoop.html' title='New hadoop coming - Yahoo drops hadoop support'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-1183192627417625557</id><published>2011-02-19T13:52:00.000-08:00</published><updated>2011-02-19T13:52:39.265-08:00</updated><title type='text'>Watson and Hadoop</title><content type='html'>What I didn't realize until now is that Watson's database is in Hadoop. As those (few) of you who follow my blog know, I believe that the Google inspired distributed data base approach which uses the map-reduce framework will increasingly be applied to non-search problems. 
In the case of Watson, the hadoop database is very small by hadoop standards, only 500GB. But that data was spread across many, many processors, supposedly with 80 teraflops of processing power.&lt;br /&gt;&lt;br /&gt;That is a huge amount of processing power for such a small data base, presumably needed to get the response time down to the 3 second range required for Jeopardy.&lt;br /&gt;&lt;br /&gt;In an interesting side-note, Google has recently moved away from the map-reduce model for some of their processing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-1183192627417625557?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/1183192627417625557/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=1183192627417625557' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/1183192627417625557'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/1183192627417625557'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2011/02/watson-and-hadoop.html' title='Watson and Hadoop'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-841296579707785222</id><published>2011-02-19T13:29:00.000-08:00</published><updated>2011-02-19T13:30:37.523-08:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='HADOOP'/><category scheme='http://www.blogger.com/atom/ns#' term='sandbox'/><category scheme='http://www.blogger.com/atom/ns#' term='Amazon Elastic Cloud'/><category scheme='http://www.blogger.com/atom/ns#' term='startup'/><category scheme='http://www.blogger.com/atom/ns#' term='incubator'/><category scheme='http://www.blogger.com/atom/ns#' term='microsoft'/><category scheme='http://www.blogger.com/atom/ns#' term='64 bit linux'/><category scheme='http://www.blogger.com/atom/ns#' term='venture capital'/><title type='text'>A Sandbox for Startups at Stern/NYU</title><content type='html'>This is only tangentially related to research computing, but I have floated a proposal at the Stern School to start a sandbox for entrepreneurs who are doing technology based startups. The concept is fairly simple if one understands cloud computing. The school would run a small cloud for NYU students taking entrepreneurial and innovation classes. Groups that need an environment to develop a prototype of an idea could get started in the cloud for very little money. The cloud would have all of the usual suspects in terms of platforms and software applications: Windows servers, Windows 7 machines, all flavors of linux, the full Microsoft suite of applications, hadoop, and all of the open source applications. There would be short courses on different application approaches and where they are appropriate. &lt;br /&gt;&lt;br /&gt;Access would initially be limited to students in certain classes, but could eventually be opened up to the whole university and alumni. There would be some physical space that groups could meet in and interact with the other groups. One would hope that it would become self-sustaining with the students inviting speakers in. &lt;br /&gt;&lt;br /&gt;If successful, one would think that there would be support from the VC community. 
NYU already has an incubator that could take ideas to the next step.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The initial reactions seem to be very positive. Stay tuned.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-841296579707785222?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/841296579707785222/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=841296579707785222' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/841296579707785222'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/841296579707785222'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2011/02/sandbox-for-startups-at-sternnyu.html' title='A Sandbox for Startups at Stern/NYU'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-7846927895659242095</id><published>2011-02-18T08:02:00.000-08:00</published><updated>2011-02-18T08:35:58.005-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='jeopardy'/><category scheme='http://www.blogger.com/atom/ns#' term='High Performance Computing'/><category scheme='http://www.blogger.com/atom/ns#' term='IBM'/><category scheme='http://www.blogger.com/atom/ns#' term='dragon naturally speaking'/><category scheme='http://www.blogger.com/atom/ns#' 
term='HPC'/><category scheme='http://www.blogger.com/atom/ns#' term='natural language processing'/><category scheme='http://www.blogger.com/atom/ns#' term='Watson'/><category scheme='http://www.blogger.com/atom/ns#' term='medical diagnosis'/><title type='text'>What is Watson??</title><content type='html'>&lt;a href="http://www.nytimes.com/2010/06/20/magazine/20Computer-t.html?ref=science"&gt;What is Watson?&lt;/a&gt;&lt;br /&gt;is an overview of the machine that won Jeopardy this week. This is certainly a breakthrough moment in artificial intelligence (AI). It shows just how far natural language processing has come. &lt;br /&gt;&lt;br /&gt;20 years ago, I was part of a team that evaluated several of IBM's leading natural language data base interfaces. Users would type in English queries, which would be translated into SQL and then executed.&lt;br /&gt;Both systems had some major flaws, and their performance was worse than that of users who had just minimal training in SQL. &lt;br /&gt;&lt;br /&gt;The domain of the queries was very limited, input was typed, and output was the result of a query. And even then the performance was poor.  &lt;br /&gt;&lt;br /&gt;Watson's input was spoken speech, the domain of the queries was enormous, and the answer was phrased as a question in English. It clearly outperformed two of the best Jeopardy players in history. Watson and its children can only get better. One can easily see systems like Watson being immediately (if not already) in use for military purposes, then large corporations, then small companies, and eventually available on your mobile device. (My iphone has much more processing power than the largest computer of the early 1970's, and it can already understand spoken speech and translate it into text). &lt;br /&gt;&lt;br /&gt;Think of a world with server farms of Watsons available (think Google clusters).&lt;br /&gt;&lt;br /&gt;A simple iphone/android app would allow you to talk to Watson on the backend. 
All the app needs is a natural language front-end like Dragon Naturally Speaking, and a text to speech converter, or the answer could be sent as an audio / video file.&lt;br /&gt;&lt;br /&gt;IBM could probably do that right now, with very little effort. How much would you pay each month for an app that connects you to a Watson? Some people/companies would pay a lot. IBM obviously has a huge investment in Watson. But "cloning" a Watson will cost only a tiny fraction of that investment (although it will still be expensive). Or, they could have "baby" Watsons, with a fraction of the processing power and a limited data base, but focused on a particular vertical (like Pharma).&lt;br /&gt;&lt;br /&gt;IBM's shares have gone from $115 to $160 in the last year. I think Watson has tremendous commercial potential, and IBM has no competition. The next few years should be very interesting. &lt;br /&gt;&lt;br /&gt;In fact, while I was writing this blog, IBM announced a research agreement with Nuance (owner of Dragon) to focus on the medical industry, where Dragon has a very large footprint already. Looks like low hanging fruit. &lt;a href=http://newenterprise.allthingsd.com/20110217/done-with-silly-game-shows-ibms-watson-finds-a-job/?mod=googlenews&gt; IBM/NUANCE Agreement &lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Don't be surprised to see more of these agreements in the next few months.&lt;br /&gt;&lt;br /&gt;Oh, and by the way. 
Even though you can't ask Watson questions quite yet, you can ask the Watson team questions about Watson.&lt;br /&gt;&lt;img src="http://i.i.com.com/cnwk.1d/i/tim/2011/02/17/Screen_shot_2011-02-17_at_3.35.44_PM_610x177.png"&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-7846927895659242095?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/7846927895659242095/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=7846927895659242095' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7846927895659242095'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7846927895659242095'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2011/02/what-is-watson.html' title='What is Watson??'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-746326580073722836</id><published>2011-02-01T09:26:00.001-08:00</published><updated>2011-02-09T08:53:49.523-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='bing'/><category scheme='http://www.blogger.com/atom/ns#' term='Google'/><category scheme='http://www.blogger.com/atom/ns#' term='microsoft'/><title type='text'>bing copies google?</title><content type='html'>http://news.cnet.com/8301-30685_3-20030206-264.html&lt;br 
/&gt;&lt;br /&gt;&lt;br /&gt;- Posted using BlogPress from my iPad&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-746326580073722836?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/746326580073722836/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=746326580073722836' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/746326580073722836'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/746326580073722836'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2011/02/bing-copies-google.html' title='bing copies google?'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-8207235612128882948</id><published>2011-01-21T17:18:00.000-08:00</published><updated>2011-02-01T09:21:27.535-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='research computing'/><category scheme='http://www.blogger.com/atom/ns#' term='live migration'/><category scheme='http://www.blogger.com/atom/ns#' term='Dell'/><category scheme='http://www.blogger.com/atom/ns#' term='vmware'/><category scheme='http://www.blogger.com/atom/ns#' term='kvm'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' 
term='san'/><category scheme='http://www.blogger.com/atom/ns#' term='iscsi'/><title type='text'>Clouds and Sun and SAN</title><content type='html'>All of the equipment for our new cloud arrived last week. To our delight, installation, configuration and deployment took very little time. The first equipment to arrive was the DELL Equallogic PS 4000E SAN. Really a baby SAN, with 8 slow SATA 7200 RPM 2TB disks. But, we had opted for supportability and VMWARE integration over top end performance. In addition, in an iSCSI network with 2 gb nics, 15K SAS drives would be overkill.&lt;br /&gt;&lt;br /&gt;We had the SAN up and running in under an hour. Another hour and it was building a RAID 6 10TB SAN with a hot spare. (We lost 6TB of storage to the 2 parity drives and the hot spare.) It was immediately usable while it was building. The good news was that the Equallogic software was easy to use, and the initial configuration took only a few minutes.&lt;br /&gt;&lt;br /&gt;The next day, the first R610 arrived. Stick it in a rack, plop in a CD and voila! A 24 core (OK, 12 cores with hyper threading) system with 48GB of ram is running as a VMWARE server. Each of our R610s has 4 nics. We are using 2 for iSCSI traffic, one for VMWARE traffic, and one for gateway access to the internet.&lt;br /&gt;&lt;br /&gt;Some initial performance tests indicated that we were getting reasonable (70MB/sec) access from a single VM to the SAN and aggregate access of about 170MB/sec. Not phenomenal, but much better than our previous environment. The good news was that VMware recognised the SAN and we had options for SNAPSHOTS etc. (More on that in a later post).&lt;br /&gt;&lt;br /&gt;We still have some networking issues to resolve, but we needed to see how to migrate from our KVM environment to the VMware environment.&lt;br /&gt;&lt;br /&gt;The path was simple. 
(Thank you Kushagra Urs)&lt;br /&gt;&lt;br /&gt;Shut down the KVM VM.&lt;br /&gt;Use qemu-img to convert the disk from qcow to vmdk (the VMware disk format).&lt;br /&gt;scp the disk to the VMware server storage (on the SAN).&lt;br /&gt;Create a VM with the same memory, processors, nics etc.&lt;br /&gt;Add the disk to the VM...&lt;br /&gt;Boot&lt;br /&gt;&lt;br /&gt;OOPs, we had a problem. The VMWARE and KVM nic addresses have to be unique, so the VMware nics came up as eth4 and eth5 (instead of eth0 and eth1). We fixed the problem and rebooted. &lt;br /&gt;&lt;br /&gt;Migration completed. We are testing the migrated system, but so far everything looks good.&lt;br /&gt;&lt;br /&gt;We have 25 more to go...&lt;br /&gt;&lt;br /&gt;Stay tuned.&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-8207235612128882948?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/8207235612128882948/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=8207235612128882948' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8207235612128882948'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8207235612128882948'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2011/01/clouds-and-sun-and-san.html' title='Clouds and Sun and SAN'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-3382369806910447848</id><published>2010-12-20T07:38:00.000-08:00</published><updated>2011-01-21T16:38:03.475-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='grid processing'/><category scheme='http://www.blogger.com/atom/ns#' term='Dell'/><category scheme='http://www.blogger.com/atom/ns#' term='equallogic'/><category scheme='http://www.blogger.com/atom/ns#' term='vmware'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' term='san'/><category scheme='http://www.blogger.com/atom/ns#' term='iscsi'/><category scheme='http://www.blogger.com/atom/ns#' term='openfiler'/><title type='text'>New cloud coming</title><content type='html'>We have made our decision, and now are waiting for equipment. We will be replacing our current cloud with a vmware powered cloud running on 2 Dell R610 48GB servers, and a Dell 1950 with 32GB ram. There will be a Dell Equallogic 16TB SAN and an open filer 16TB SAN providing storage, and a Dell power vault tape backup system running Amanda.&lt;br /&gt;&lt;br /&gt;The two Dell R610's have dual 6-core processors, and the 1950 has dual quad core processors.&lt;br /&gt;The total of 32 cores should be more than ample to handle all of our current cloud requirements, plus a number of grid processing machines.&lt;br /&gt;&lt;br /&gt;The SAN will be running iscsi, with dual controllers, each with 3 1gb nics. The SAN controllers are active-passive, i.e., only one is active at a time. Each of the R610's will have 4 nics, 3 of which will be used for iscsi and backup traffic. The other nic will be the gateway to the internet.&lt;br /&gt;&lt;br /&gt;We expect to be able to get very respectable io bandwidth from this configuration. 
Early experiments with an open filer SAN, running on old equipment with slow drives, gave very respectable results. The Equallogic SAN should be much faster, and the open filer SAN will be moved to much faster hardware.&lt;br /&gt;&lt;br /&gt;This new environment should be much more manageable than our current one, especially using the vmware vsphere gui to manage the cloud.&lt;br /&gt;&lt;br /&gt;Vmware and the Equallogic SAN communicate, so that snapshotting machines is a "snap".&lt;br /&gt;We will be able to automate the backup of our cloud in a way that was cumbersome in our old environment.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-3382369806910447848?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/3382369806910447848/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=3382369806910447848' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/3382369806910447848'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/3382369806910447848'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/12/new-cloud-coming.html' title='New cloud coming'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-4279153265300405922</id><published>2010-11-01T18:35:00.001-07:00</published><updated>2011-02-09T08:54:44.622-08:00</updated><title type='text'>What should our SAN be?</title><content type='html'>We have been experimenting with Openfiler as our central data repository, but given that it is not a supported VMware environment, we are now looking at a hybrid environment with a small (6TB) supported iSCSI SAN as the main repository for VMs, and one or more Openfiler SANs for less important data.&lt;br /&gt;&lt;br /&gt;We need to get our new environment up and running as fast as possible, and this seems to be the way. But it is expensive: as best I can tell, even an inexpensive VMware-supported SAN will cost at least 3 times what an open source solution would.&lt;br /&gt;&lt;br /&gt;- Posted using BlogPress from my iPad&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-4279153265300405922?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/4279153265300405922/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=4279153265300405922' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/4279153265300405922'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/4279153265300405922'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/11/what-should-our-san-be.html' title='What should our SAN be?'/><author><name>Norman 
White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-2320443538862139647</id><published>2010-10-25T08:13:00.001-07:00</published><updated>2011-02-09T08:53:12.658-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Nic bonding'/><category scheme='http://www.blogger.com/atom/ns#' term='bandwidth'/><category scheme='http://www.blogger.com/atom/ns#' term='iscsi'/><title type='text'>Network bandwidth for research computing</title><content type='html'>As our storage needs have grown dramatically, from about 400GB in 2005 to &gt;50TB today, our 1Gb network is having trouble keeping up. As part of the move to a new architecture, we are also looking for ways to increase bandwidth. Our SAN needs to have at least four 1Gb NICs, and many of our servers need 2-4Gb/sec network speeds.&lt;br /&gt;We have several different types of network demands: general usage (ssh etc.), NFS/iSCSI traffic, backup traffic, and VMware traffic if we want to do live migration.&lt;br /&gt;&lt;br /&gt;To start, we are going to isolate the different types of traffic onto separate subnets. This will allow us to better monitor and analyze what we need.&lt;br /&gt;We may use NIC bonding on certain systems to boost bandwidth. Our experience with bonding has not been very good. 
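&lt;br /&gt;&lt;br /&gt;As a rough sanity check on these bandwidth figures, the sketch below converts a number of 1Gb links into a theoretical throughput in MB/sec. It assumes 1 Gb/s = 1000 Mb/s and 8 bits per byte, and it ignores protocol overhead, so real iSCSI or NFS throughput will be lower. The function name is made up for illustration.&lt;br /&gt;
&lt;pre&gt;
```python
# Back-of-the-envelope conversion from NIC count to theoretical throughput.
# Assumptions (not measurements): 1 Gb/s = 1000 Mb/s, 8 bits per byte,
# and no allowance for TCP, iSCSI or NFS protocol overhead.

def aggregate_mb_per_sec(nic_count, gbit_per_nic=1.0):
    """Return the theoretical aggregate throughput in MB/s."""
    return nic_count * gbit_per_nic * 1000 / 8


if __name__ == "__main__":
    for nics in (1, 2, 4):
        print(nics, "x 1Gb NIC:", aggregate_mb_per_sec(nics), "MB/s at best")
```
&lt;/pre&gt;
So a single 1Gb link tops out near 125MB/sec and four links near 500MB/sec, before any overhead.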
&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;- Posted using BlogPress from my iPad&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-2320443538862139647?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/2320443538862139647/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=2320443538862139647' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2320443538862139647'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2320443538862139647'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/10/network-bandwidth-for-research.html' title='Network bandwidth for research computing'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-7797454337683451835</id><published>2010-10-25T07:44:00.001-07:00</published><updated>2011-02-09T08:52:13.748-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='vmware'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' term='san'/><category scheme='http://www.blogger.com/atom/ns#' term='64 bit linux'/><title type='text'>Openfiler for cloud computing</title><content type='html'>Our experience with the performance of our inexpensive array, and the flexibility 
of using Linux raid (mdadm etc.), is moving us towards rearchitecting our cloud to make it more reliable and easier to manage.&lt;br /&gt;&lt;br /&gt;Our current disk storage is an array attached to the main cloud server. While this is fine for performance, it makes that system very hard to maintain, since if we take it down, we have to take down the main storage system for both our cloud and grid computing environment. A more standard solution is to have multiple cloud servers attached to a SAN.&lt;br /&gt;&lt;br /&gt;Having recently discovered that VMware is free for research and instructional use, and since we already use VMware in the university and school for administrative systems, we decided to move to VMware.&lt;br /&gt;&lt;br /&gt;But, in looking at the prices of SANs of the size we needed (&gt;50 TB), we looked to open source solutions. Openfiler seemed to be the obvious solution. It can turn a moderately fast system with lots of disk and NICs into an easy-to-manage SAN, NAS, etc. We have taken an old system and bolted about 1TB of disk onto it for testing. After some mild tuning, we are getting about 100MB/sec write speed, which isn't bad for slow IDE disks. We are now starting to stress test from an 8-core VMware server.&lt;br /&gt;&lt;br /&gt;Our plan is to run Openfiler on a much faster box with at least 4 NICs and several 20 TB arrays. This will provide the major central storage repository for both cloud and grid computing. 
We expect aggregate write speeds of about 400MB/sec, which is plenty for our small environment.&lt;br /&gt;&lt;br /&gt;Stay tuned for updates.&lt;br /&gt;&lt;br /&gt;- Posted using BlogPress from my iPad&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-7797454337683451835?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/7797454337683451835/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=7797454337683451835' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7797454337683451835'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7797454337683451835'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/10/openfiler-for-cloud-computing.html' title='Openfiler for cloud computing'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-7448072568026584805</id><published>2010-10-25T07:26:00.001-07:00</published><updated>2011-02-09T08:54:27.417-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='vmware'/><category scheme='http://www.blogger.com/atom/ns#' term='kvm'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' term='XEN'/><category scheme='http://www.blogger.com/atom/ns#' 
term='san'/><title type='text'>Moving to vmware for cloud computing</title><content type='html'>After using xen, then KVM we are now moving to vmware. We recently discovered that  vmware is free for research and teaching. Benchmarking now, so far results are encouraging, especially the management tools compared to KVM. Our plan is to move to an iscsi San, with an openfiler as the san.  &lt;br /&gt;- Posted using BlogPress from my iPad&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-7448072568026584805?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/7448072568026584805/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=7448072568026584805' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7448072568026584805'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7448072568026584805'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/10/moving-to-vmware-for-cloud-computing.html' title='Moving to vmware for cloud computing'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-650393109187012214</id><published>2010-09-10T11:48:00.000-07:00</published><updated>2010-09-10T11:57:42.641-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category 
scheme='http://www.blogger.com/atom/ns#' term='mdadm'/><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='sas'/><category scheme='http://www.blogger.com/atom/ns#' term='raid 6'/><title type='text'>Yet Another Update on our disk array - 5 drives fail?</title><content type='html'>A few days ago, someone was working in the back of the rack that holds our home-built raid 6 array. Apparently, one of the SAS cables was knocked loose. The linux raid software, mdadm, immediately reported 5 failed drives and removed them from the configuration. At that point the array was dead. Reconnecting the cable didn't help, and neither did a reboot. In fact, as we discovered later, the reboot loaded a new kernel in which the SAS port multiplier logic appeared to be broken. We have three 1-to-5 port multipliers (see earlier posts), and the new kernel could only recognize the first 4 drives on each port multiplier. After some debugging, and rebooting into the correct kernel, we could see all 15 drives.&lt;br /&gt;&lt;br /&gt;Examining the drives (mdadm -E sd[b-p])&lt;br /&gt;showed that 5 drives saw the correct configuration (the 5 that had gone off-line), but the other 10 now saw that the array had 5 failed drives. Raid 6 can only tolerate 2 simultaneous failures... Ouch, what to do? But we knew the drives were fine...&lt;br /&gt;&lt;br /&gt;A call to Dell wasn't very helpful, since we were running Centos 5 on the server.&lt;br /&gt;&lt;br /&gt;In desperation, I posted a note on the linux raid mailing list.&lt;br /&gt;&lt;br /&gt;Responses came back within hours, including one from someone who seemed very knowledgeable, helpful and encouraging. He even asked for the output of mdadm -E, which I immediately sent. 
A little googling indicated that he (Neil Brown) was the maintainer of mdadm!&lt;br /&gt;&lt;br /&gt;He assured me that just a&lt;br /&gt;mdadm --assemble --force /dev/md0 sd[b-p]&lt;br /&gt;would do no harm, and very likely fix the problem.&lt;br /&gt;&lt;br /&gt;So, we tried it, and after a reboot, everything was back to normal, saving the restore of many TBs of data.&lt;br /&gt;&lt;br /&gt;Certainly one of the advantages of open source is that you can often get directly to the right person, instead of having to wade through many levels of support to find out there is no right person.&lt;br /&gt;&lt;br /&gt;The description of the array can be found at:&lt;br /&gt;http://researchcomputing.blogspot.com/2010_03_01_archive.html&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-650393109187012214?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/650393109187012214/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=650393109187012214' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/650393109187012214'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/650393109187012214'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/09/yet-another-update-on-our-disk-array-5.html' title='Yet Another Update on our disk array - 5 drives fail?'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-3063424389789047310</id><published>2010-08-17T07:54:00.000-07:00</published><updated>2010-08-17T07:54:51.999-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='yum upgrade'/><category scheme='http://www.blogger.com/atom/ns#' term='server cloning'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' term='qemu-img'/><category scheme='http://www.blogger.com/atom/ns#' term='Centos'/><category scheme='http://www.blogger.com/atom/ns#' term='64 bit linux'/><category scheme='http://www.blogger.com/atom/ns#' term='snapshot'/><title type='text'>Using the cloud to test updating of physical servers</title><content type='html'>Some of our servers are running older versions of Centos, but the servers are so heavily used that we have postponed upgrading them to the current version. The situation has become increasingly intolerable, as our older servers can't run some of our software due to library incompatibility problems. However, we were reluctant to try upgrading them without some form of testing first, due to the inconvenience we would cause our users if we were down for a significant period of time.&lt;br /&gt;&lt;br /&gt;We are working on two solutions, using our cloud environment. &lt;br /&gt;&lt;br /&gt;The first is to approximately replicate the environment of the older servers by building virtual servers that run the same version of Centos (4.8) and run all of our main applications. We have successfully done this, and then created a "frozen" version of that system and replaced the original system disk with a qcow2 disk, created from a qemu-img snapshot, that holds only the changes. 
That way we can restart the process if we run into problems, by just reverting to the snapshot and trying again.&lt;br /&gt;&lt;br /&gt;The second method is to actually "clone" the physical system, so that we have an exact replica running in the cloud. We can then test updating the replica. Obviously, this is the preferable way to go.&lt;br /&gt;&lt;br /&gt;We have now succeeded in upgrading the "approximate" replica from Centos 4.8 to Centos 5.3, and then used yum update to bring it to Centos 5.5. The process went extremely smoothly, with the exception that we had to upgrade yum from Centos 4.8 to Centos 5.3 after the upgrade. That was just a yum upgrade yum command. A yum update then completed the upgrade and brought the Centos 5.3 system up to Centos 5.5. As far as we can tell, all of our applications still run.&lt;br /&gt;&lt;br /&gt;We are still struggling with cloning the actual physical server. As best we can tell, you need to take the server down to clone it. That would be fine if we wanted to replace it with a virtual server running in our cloud, but the process of cloning (i.e. 
dd'ing all of the system disks), will likely take more time than the actual upgrade (assuming the upgrade runs as flawlessly as our upgrade of the approximate replica).&lt;br /&gt;&lt;br /&gt;In any event, being able to clone a physical server exactly is a very handy capability, so we will continue our experiment.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-3063424389789047310?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/3063424389789047310/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=3063424389789047310' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/3063424389789047310'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/3063424389789047310'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/08/using-cloud-to-test-updating-of.html' title='Using the cloud to test updating of physical servers'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-6076732786137663853</id><published>2010-06-13T09:36:00.000-07:00</published><updated>2010-06-13T09:36:24.190-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mdadm'/><category scheme='http://www.blogger.com/atom/ns#' term='sata drives'/><category scheme='http://www.blogger.com/atom/ns#' 
term='software raid'/><category scheme='http://www.blogger.com/atom/ns#' term='lvm'/><category scheme='http://www.blogger.com/atom/ns#' term='udev'/><category scheme='http://www.blogger.com/atom/ns#' term='raid 6'/><title type='text'>Another update on the cheap disk array</title><content type='html'>From earlier posts, you can see our ongoing attempt to build the cheapest, but still reliable and reasonably fast, storage array. For about $3,500 we built a 30TB array from an inexpensive storage box from Addonics, 15 2TB Hitachi disks, 4 port multipliers, and a SIL SAS card. For a variety of reasons, we chose to use the linux raid software to build a RAID 6 volume with a hot spare and 2 parity disks, giving us about 24TB of usable space. Initial performance tests were encouraging, with the array being much faster than we had feared (about 150MB/s sustained, bursts to well over 400MB/s). We have moved about 3TB of files onto it, and have been waiting to see if any components failed. So far so good. &lt;br /&gt;&lt;br /&gt;Our next test was to see what happened if we powered it down (after unmounting it) and then powered it back up. Very interesting! The raid 6 array had been built from devices /dev/sda --&gt; /dev/sdp (15 sata drives). When we powered the array back up, we discovered that the original raid volume (/dev/md0) was showing all drives as failed.&lt;br /&gt;&lt;br /&gt;After looking around, we discovered that the drives were now at /dev/sdq --&gt;/dev/sdab.&lt;br /&gt;&lt;br /&gt;Annoying, but we "reassembled" a new array, /dev/md1, from the new drive locations.&lt;br /&gt;&lt;br /&gt;Voila, we had a newly constructed device with apparently all of the files. BUT,&lt;br /&gt;lvm was looking for a physical device at /dev/md0, and now it could see the physical disk for its volume group at /dev/md1. 
It didn't seem to like that.&lt;br /&gt;After some fiddling, I discovered that if I made the original volume group inactive (vgchange -an vg_t30a_vg1)&lt;br /&gt;&lt;br /&gt;I could export it and then import it back, and things seemed to work. But the question then became what happens when we reboot? Sure enough, after a reboot, the drives were back at /dev/sda --&gt; /dev/sdp. Ouch!&lt;br /&gt;&lt;br /&gt;Actually, it wasn't even that simple: after a reboot, the drives that came back were /dev/sda --&gt;/dev/sdm (i.e. 3 missing drives). More fiddling, and we discovered that if we disconnected the 3 SAS cables and then reconnected them, we could see all 15 drives.&lt;br /&gt;&lt;br /&gt;We reassembled the array, and now magically everything worked, no data lost, etc.&lt;br /&gt;&lt;br /&gt;But this is not a stable solution, since just powering the array down and back up would mean either a reboot of the system or lots of fiddling.&lt;br /&gt;&lt;br /&gt;Some googling found what appears to be the solution: &lt;a href="http://en.wikipedia.org/wiki/Udev"&gt;UDEV &lt;/a&gt;rules that will always map the drives to the same /dev location.&lt;br /&gt;&lt;br /&gt;We will be trying that next week. 
Assuming it works, the array will be put in production, and we may build another one.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-6076732786137663853?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/6076732786137663853/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=6076732786137663853' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/6076732786137663853'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/6076732786137663853'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/06/another-update-on-cheap-disk-array.html' title='Another update on the cheap disk array'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-7646778608682668620</id><published>2010-06-07T09:22:00.000-07:00</published><updated>2010-06-07T09:28:49.333-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='research computing'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><title type='text'>Private Clouds becoming increasingly popular</title><content type='html'>In some earlier posts, I described the private compute cloud we have in research computing and how it solves the problems of how to quickly provision machines/services for researchers in 
a much less expensive way than using a public cloud service like Amazon or Google. I had predicted that private clouds would be widely adopted by many enterprises.&lt;br /&gt;&lt;br /&gt;A recent Information Week article indicates that this has begun to happen.&lt;br /&gt;http://www.informationweek.com/news/hardware/data_centers/showArticle.jhtml;jsessionid=FUHXDKIWXGBIBQE1GHPSKH4ATMY32JVN?articleID=225300320&amp;cid=nl_IW_week_2010-06-07_h&lt;br /&gt;&lt;br /&gt;According to this article, 58% of businesses are either already hosting a private cloud environment, or plan to do it soon.&lt;br /&gt;&lt;br /&gt;The article gives as an example, Indiana University, which already has more than 1200 machines running in the cloud. Typical time to create a new machine for a user is less than 4 hours.  (Disposal time is seconds).&lt;br /&gt;&lt;br /&gt;At Stern, we are now getting a request every week or so, but we are  very small compared to the whole university.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-7646778608682668620?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/7646778608682668620/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=7646778608682668620' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7646778608682668620'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7646778608682668620'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/06/private-clouds-becoming-popular.html' title='Private Clouds becoming increasingly popular'/><author><name>Norman 
White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-8986041371694271180</id><published>2010-05-23T09:09:00.000-07:00</published><updated>2010-05-23T09:09:21.267-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='sofware raid'/><category scheme='http://www.blogger.com/atom/ns#' term='mdadm'/><category scheme='http://www.blogger.com/atom/ns#' term='hardware raid'/><category scheme='http://www.blogger.com/atom/ns#' term='mdadm resync'/><category scheme='http://www.blogger.com/atom/ns#' term='raid 6'/><category scheme='http://www.blogger.com/atom/ns#' term='ubuntu'/><category scheme='http://www.blogger.com/atom/ns#' term='64 bit linux'/><title type='text'>New Disk Array now built and working - still testing</title><content type='html'>We have now fully populated our 15 disk array with 2TB Hitachi drives. There was a fair amount of angst in getting our dual quad core Dell 1950 to recognize all of the drives at the same time. As best I can tell, we need to move up to a newer linux kernel which fixes some problems in recognizing multiple disks on port multipliers. Our array is configured with 3 1--&gt;5 port multipliers, connected to a 4 port SAS controller. As I have mentioned in previous posts, the plan was to have a slow, inexpensive but very reliable storage device. The array is now configured as raid-6 with one hot spare, giving us about 24TB unformatted, 22 TB formatted. Initial tests with iozone showed that it is much faster than I had feared, with read speeds of close to 1GB/sec and sustained write speeds of about 150MB/sec. Write speeds for small files are much faster, usually in the 500-600MB/sec range. 
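&lt;br /&gt;&lt;br /&gt;The 24TB figure follows from the drive counts. Here is a minimal sketch of the arithmetic, assuming 2TB drives, one hot spare, and the two drives' worth of parity that raid 6 uses. The function name is hypothetical, and this is only the capacity calculation, not anything mdadm itself reports.&lt;br /&gt;
&lt;pre&gt;
```python
# Usable capacity of a raid 6 array: the hot spare holds no data, and
# raid 6 spends two drives' worth of space on parity.

def raid6_usable_tb(total_drives, drive_tb, hot_spares=1):
    """Return usable capacity in TB, before any filesystem overhead."""
    active = total_drives - hot_spares   # drives that are part of the array
    return (active - 2) * drive_tb       # two drives' worth goes to parity


if __name__ == "__main__":
    print(raid6_usable_tb(15, 2), "TB before formatting")
```
&lt;/pre&gt;
With 15 drives, that is 12 data drives of 2TB each.&lt;br /&gt;&lt;br /&gt;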
We are still testing it to make sure it is stable (this is our first experience with linux raid). &lt;br /&gt;&lt;br /&gt;It took about 12 hours to resync the array once all of the drives had been installed (about 50MB/sec per drive for 2TB).&lt;br /&gt;&lt;br /&gt;It has been up for several days as we run tests on it. Everything seemed fine until this morning, when I noticed that the array was being resynced. I have mdadm (the linux raid manager) supposedly sending out emails if there are any problems, so I was puzzled (and worried) since I hadn't received any emails. After sifting through log files and cron jobs, I found the culprit(?). Red Hat apparently does a raid check every week on all raid devices by default. The simplest way to do this is to just force a resync. Different linux distributions do it at different intervals (Ubuntu, apparently, once a month). One really needs to know about the schedule, since it takes a significant amount of cpu (almost 1 whole processor on an 8 processor system) to do the resync. &lt;br /&gt;&lt;br /&gt;For a minute, I was worried that we had made a huge mistake going to software raid. But having lost significant amounts of data as well as other problems with hardware raid, we really wanted to give software raid a try. 
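&lt;br /&gt;&lt;br /&gt;As a rough check on the 12 hour resync mentioned above: a resync has to touch every sector of each drive, so the time is roughly the drive size divided by the per-drive rate. A small sketch, assuming decimal units (1TB = 1,000,000MB) and a steady rate. The function name is made up for illustration.&lt;br /&gt;
&lt;pre&gt;
```python
# Estimate how long it takes to read or write one whole drive.
# Assumptions: decimal units (1 TB = 1,000,000 MB) and a constant rate.

def resync_hours(drive_tb, mb_per_sec):
    """Return the estimated hours to cover one drive at a steady rate."""
    seconds = drive_tb * 1_000_000 / mb_per_sec
    return seconds / 3600


if __name__ == "__main__":
    print(round(resync_hours(2, 50), 1), "hours")
```
&lt;/pre&gt;
That works out to about 11 hours, in line with the roughly 12 hours we saw.&lt;br /&gt;&lt;br /&gt;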
I may change the schedule on the resync so it is less frequent, but the tradeoff is that at least you know that every sector on every disk is still readable, and it will catch errors that might not show up for months.&lt;br /&gt;&lt;br /&gt;http://linux.yyz.us/why-software-raid.html&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;has a very nice overview of the advantages of both software raid and hardware raid.&lt;br /&gt;Each has its problems.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-8986041371694271180?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/8986041371694271180/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=8986041371694271180' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8986041371694271180'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8986041371694271180'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/05/new-disk-array-now-built-and-working.html' title='New Disk Array now built and working - still testing'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-8981449649349492778</id><published>2010-05-10T07:40:00.000-07:00</published><updated>2010-05-10T07:40:33.896-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' 
term='Chromium OS'/><category scheme='http://www.blogger.com/atom/ns#' term='Africa'/><category scheme='http://www.blogger.com/atom/ns#' term='4G'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' term='netbooks'/><category scheme='http://www.blogger.com/atom/ns#' term='3G'/><category scheme='http://www.blogger.com/atom/ns#' term='less developed countries'/><category scheme='http://www.blogger.com/atom/ns#' term='Wireless'/><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='Google'/><category scheme='http://www.blogger.com/atom/ns#' term='wifi'/><category scheme='http://www.blogger.com/atom/ns#' term='cell phones'/><category scheme='http://www.blogger.com/atom/ns#' term='ipad'/><category scheme='http://www.blogger.com/atom/ns#' term='solar power notebooks'/><category scheme='http://www.blogger.com/atom/ns#' term='server farms'/><title type='text'>Chromium OS and cloud computing for Africa and  less developed countries</title><content type='html'>Google's recent announcement of Chromium, their new cloud based operating system has many interesting ramifications. &lt;br /&gt;&lt;br /&gt;First, any system that can boot up in seconds is a game changer. Chromium should be up and usable faster than a cell phone. &lt;br /&gt;&lt;br /&gt;Second, it will allow a much lower fixed cost entry point for computing, since all of the work will be done in the cloud. A user's PC will only need to be able to run chromium, the Chrome browser and have an internet connection. They won't need a hard drive, Windows, linux, etc. Google and others are working on cloud based printing support, which would further eliminate configuration problems.&lt;br /&gt;&lt;br /&gt;One clear win for this approach is in developing countries, where one can imagine cloud based computing being done over 4g wireless such as is now being rolled out in South Africa. 
Many of the problems that have inhibited computer use in the third world can be eliminated, such as unstable power, lack of air conditioning, lack of technical support, etc. Countries could become their own cloud providers, or use companies like Google or Amazon or other third party providers to provide computing power. One could even imagine the UN hosting cloud services for the third world.&lt;br /&gt;&lt;br /&gt;Hardware requirements would be minimal compared to, say, the minimum hardware for Windows 7. Low cost, low power processors would be more than ample. Netbooks, iPads, or even used notebooks would be more than sufficient. One could even imagine a second life for old notebooks as they are stripped of hard drives (Chromium will already boot from a USB drive) and sent to the third world.&lt;br /&gt;&lt;br /&gt;Another alternative would be a simple netbook, with all components designed to be field replaceable, so even non-technicians could replace memory or a card.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Bandwidth is the obvious problem, especially into and out of the country. But this is a much easier problem to handle than providing support services for millions of distributed PCs.  Google already drops in server farms in a truck. &lt;br /&gt;&lt;br /&gt;Power is another problem. Solar power is enough to power and charge the batteries of notebooks, but maybe not enough to power cell towers, and certainly not enough to power cloud server farms (at least not now). 
However, the total power requirements should be much less than would have been used by a more conventional approach to computing, and it would provide open access to virtually (no pun intended) unlimited computing power.&lt;br /&gt;&lt;br /&gt;I welcome comments....&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-8981449649349492778?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/8981449649349492778/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=8981449649349492778' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8981449649349492778'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8981449649349492778'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/05/chromium-os-and-cloud-computing-for.html' title='Chromium OS and cloud computing for Africa and  less developed countries'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-8043010200348371719</id><published>2010-04-25T14:07:00.000-07:00</published><updated>2010-04-26T11:38:08.934-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='raid 6'/><category scheme='http://www.blogger.com/atom/ns#' term='64 bit linux'/><title type='text'>Update on Disk Array</title><content type='html'>We have 
our array installed and have put a small raid 6 array on four 2TB disks. It took about 3 1/2 hours to build the array using mdadm. Building the array consisted of several steps.&lt;br /&gt;1) Reboot and look at dmesg to find the device addresses for the new drives.&lt;br /&gt;dmesg|grep scsi found them.&lt;br /&gt;2) Run fdisk to put a partition of type "fd" on each drive.&lt;br /&gt;3) Run mdadm --create /dev/md0 --level=6 --raid-devices=4 /dev/sdc /dev/sdd /dev/sde /dev/sdf&lt;br /&gt;4) Mount /dev/md0 (after putting a file system on it)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Initial performance tests show write speeds of about 70MB/sec, which is somewhat better than I had expected. Remember, the disks are slow 7200 rpm 2TB Hitachi drives which cost around $150 each. Once we fully populate the array, I plan to have 12 data disks, plus 2 for parity, plus a hot spare.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Total cost for 24TB will be about $5200. There is also the likelihood that the performance will improve somewhat as we add more disks, unless we are already constrained by parity writes.&lt;br /&gt;&lt;br /&gt;That is probably about as inexpensive as storage gets. 
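&lt;br /&gt;&lt;br /&gt;The 24TB figure follows from simple arithmetic, sketched here in Python as an illustration only. The function and argument names are assumptions made up for this example and are not part of mdadm: a raid 6 set gives up two disks' worth of space to parity, and a hot spare holds no data until a disk fails.&lt;br /&gt;&lt;br /&gt;

```python
# Sketch only: the function name and argument names are illustrative assumptions.
def raid6_usable_tb(total_disks, disk_tb, hot_spares=0):
    """Usable capacity of one RAID 6 set: two disks' worth of space holds parity."""
    active = total_disks - hot_spares   # disks that are members of the array
    if min(active, 4) != 4:             # RAID 6 needs at least four member disks
        raise ValueError("RAID 6 needs at least 4 active disks")
    return (active - 2) * disk_tb


print(raid6_usable_tb(4, 2))                  # the 4-disk test array
print(raid6_usable_tb(15, 2, hot_spares=1))   # the planned 15-bay configuration
```

&lt;br /&gt;&lt;br /&gt;With these inputs the 4-disk test array comes out at 4TB usable and the fully populated 15-bay unit at 24TB.&lt;br /&gt;&lt;br /&gt;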
We are going to use it for storage of large, mostly static data or on-line backups of other file systems that will eventually be rolled off to tape.&lt;br /&gt;&lt;br /&gt;&lt;iframe src="http://www.facebook.com/widgets/like.php?href=http://researchcomputing.blogspot.com/" scrolling="no" frameborder="0" style="border:none; width:450px; height:80px"&gt;&lt;/iframe&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-8043010200348371719?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/8043010200348371719/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=8043010200348371719' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8043010200348371719'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8043010200348371719'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/04/update-on-disk-array.html' title='Update on Disk Array'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-2144723579348813953</id><published>2010-03-25T15:14:00.000-07:00</published><updated>2010-09-10T11:24:01.276-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='disk storage'/><category scheme='http://www.blogger.com/atom/ns#' term='sata disk'/><category 
scheme='http://www.blogger.com/atom/ns#' term='raid array'/><category scheme='http://www.blogger.com/atom/ns#' term='sas'/><title type='text'>Really inexpensive raid6 disk array</title><content type='html'>One of our problems in research computing is that we have projects that have massive amounts of data, where the data is being collected daily, but often only a small percentage is ever accessed. One example is the Options Price Reporting Authority (OPRA) options data which we receive every night from the International Securities Exchange (ISE). It varies from about 25GB compressed per night up to 60GB on days of high volatility. We recently decided to see how we could build a highly reliable but very inexpensive storage device. This is what we eventually decided on.&lt;br /&gt;&lt;br /&gt;We bought a 15 bay storage unit from Addonics (about $1,200). It has 3 sas connectors which each go to a port multiplier that has 5 sata connections. We bought an inexpensive ($350) sas card which has 4 sas ports. We will populate the array with 15 inexpensive 2TB SATA disks from Hitachi. They cost about $180 each.&lt;br /&gt;&lt;br /&gt;The Addonics parts are:&lt;br /&gt; 1 SR460RHPM 4U rack mount storage chassis with one port multiplier&lt;br /&gt;and redundant power supply&lt;br /&gt; 2 AD5SARPM-E eSATA port multipliers for rackmount system.&lt;br /&gt;&lt;br /&gt;We had Addonics install everything before shipping.&lt;br /&gt;&lt;br /&gt;We then bought 15 Hitachi 2TB SATA drives.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Total cost is in the range of $4,500, which is less than 50% of the cost of an inexpensive RAID array, and MUCH less expensive than a SAN or NAS. We plan to use the linux raid 6 software support to raid the disks together, with two parity disks and a hot spare.&lt;br /&gt;That way we would have to lose three disks within a very short time window before we have any data loss, since raid 6 survives the loss of any two. 
So, we will have 24TB of storage for under $5,000.&lt;br /&gt;&lt;br /&gt;We expect it will be quite slow, probably around 50MB /sec due to the contention for the 3 sas channels. But for some of our data, that is not really a problem. This device will allow us to move data off much higher performance devices in our storage hierarchy. Stay tuned for benchmarks.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-2144723579348813953?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/2144723579348813953/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=2144723579348813953' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2144723579348813953'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2144723579348813953'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/03/really-inexpensive-raid6-disk-array.html' title='Really inexpensive raid6 disk array'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-2636226131139933299</id><published>2010-03-18T06:55:00.000-07:00</published><updated>2010-03-18T06:55:41.143-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tesla'/><category scheme='http://www.blogger.com/atom/ns#' term='gpu'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Nvidia'/><category scheme='http://www.blogger.com/atom/ns#' term='fermi hpc'/><category scheme='http://www.blogger.com/atom/ns#' term='Accelereyes'/><category scheme='http://www.blogger.com/atom/ns#' term='matlab'/><category scheme='http://www.blogger.com/atom/ns#' term='Jacket'/><category scheme='http://www.blogger.com/atom/ns#' term='Cuda'/><title type='text'>New Accelereyes Jacket support really works</title><content type='html'>We finally had time to test out the new release of "Jacket", the matlab interface to the NVIDIA Cuda library. This pre-release has a number of new features, including support for the matlab "inv" (matrix inversion) function. Initial results are very promising, with an 8-10 times performance increase for code that does very large multiple regressions. (We are running on a dual quad core Super Micro machine with 24 GB of ram and an Nvidia Tesla C1060 GPU). Results for the about-to-be-released "Fermi" GPU (512 cores vs 240) should be even more impressive.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-2636226131139933299?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/2636226131139933299/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=2636226131139933299' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2636226131139933299'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2636226131139933299'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/03/new-accelereyes-jacket-support-really.html' 
title='New Accelereyes Jacket support really works'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-9036155047984247769</id><published>2010-03-14T09:23:00.000-07:00</published><updated>2010-03-14T09:24:36.474-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='gpu'/><category scheme='http://www.blogger.com/atom/ns#' term='linear algebra'/><category scheme='http://www.blogger.com/atom/ns#' term='matlab'/><category scheme='http://www.blogger.com/atom/ns#' term='Jacket'/><title type='text'>Jacket GPU support for Nvidia Tesla</title><content type='html'>Accelereyes is about to release version 1.3 of their Jacket matlab interface to the Nvidia CUDA library. Jacket had some major limitations for our applications, since it did not support the matlab INV function. That has now been added, along with some other capabilities.&lt;br /&gt;&lt;br /&gt;We have a prerelease that we are trying to get working now. 
There currently seems to be a problem installing it on 64 bit platforms, but I am sure that we can get through that.&lt;br /&gt;&lt;br /&gt;I have listed some of the additions below:&lt;br /&gt;&lt;br /&gt;Additions:&lt;br /&gt;&lt;br /&gt;conv2 (separable)&lt;br /&gt;sort (matrices, volumes, indexed output)&lt;br /&gt;sortrows, issorted&lt;br /&gt;eps, nextpow2, nthroot, realpow&lt;br /&gt;nnz&lt;br /&gt;toeplitz, hankel&lt;br /&gt;tril, triu&lt;br /&gt;sind, cosd, tand&lt;br /&gt;&lt;br /&gt;Linear algebra additions:&lt;br /&gt;&lt;br /&gt;det&lt;br /&gt;filter&lt;br /&gt;norm (matrices)&lt;br /&gt;eig, svd&lt;br /&gt;inv, mldivide&lt;br /&gt;chol, lu, qr&lt;br /&gt;mpower&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-9036155047984247769?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/9036155047984247769/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=9036155047984247769' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/9036155047984247769'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/9036155047984247769'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/03/jacket-gpu-support-for-nvidia-tesla.html' title='Jacket GPU support for Nvidia Tesla'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-8552862309023916924</id><published>2010-03-14T09:11:00.000-07:00</published><updated>2010-03-14T09:16:19.064-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mobile phones'/><category scheme='http://www.blogger.com/atom/ns#' term='java'/><category scheme='http://www.blogger.com/atom/ns#' term='HADOOP'/><category scheme='http://www.blogger.com/atom/ns#' term='distributed computing'/><category scheme='http://www.blogger.com/atom/ns#' term='Android'/><category scheme='http://www.blogger.com/atom/ns#' term='Google'/><category scheme='http://www.blogger.com/atom/ns#' term='military applications'/><title type='text'>Hadoop on Mobile Phones?</title><content type='html'>It seems like a crazy idea, but what would happen if one could run hadoop on a cluster of mobile phones? Soon, mobile phones will have the local disk memory, network speed and ram to handle hadoop. Hadoop is written entirely in java, so the question is whether it needs java features that are not in Java ME. As I remember, all of hadoop's communications are on top of http. Seems like an interesting student project: investigate what is missing in Java ME for hadoop to run.&lt;br /&gt;Google's Android platform would seem to be an obvious one to start with.&lt;br /&gt;&lt;br /&gt;But why do it? &lt;br /&gt;&lt;br /&gt;I need to think about that. One would have to have a distributed database with enough replicas that even if only a small percentage of collaborating phones were available, the data would be available. Note that hadoop would handle different phones leaving and joining the network, although network traffic would certainly be an issue. The database could also be one to which the users with hadoop nodes could add data. 
&lt;br /&gt;&lt;br /&gt;I think that the first likely application area would be in the military, where cell towers could be set up in an area, and all troops could immediately start sharing information. &lt;br /&gt;&lt;br /&gt;Need to think more about this...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-8552862309023916924?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/8552862309023916924/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=8552862309023916924' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8552862309023916924'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8552862309023916924'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/03/hadoop-on-mobile-phones.html' title='Hadoop on Mobile Phones?'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-7887595720978394806</id><published>2010-03-07T14:17:00.000-08:00</published><updated>2010-03-07T14:17:10.284-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='HADOOP'/><category scheme='http://www.blogger.com/atom/ns#' term='Sun Grid Engine'/><category scheme='http://www.blogger.com/atom/ns#' term='distributed computing'/><title type='text'>Sun Grid Engine 6.2u5 and 
Hadoop</title><content type='html'>I just discovered that the latest version of Sun Grid Engine now integrates with Hadoop, so you can use it to schedule and run hadoop jobs. Since we run both Sun Grid Engine and hadoop, this should help streamline our environment.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-7887595720978394806?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/7887595720978394806/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=7887595720978394806' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7887595720978394806'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7887595720978394806'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/03/sun-grid-engine-62u5-and-hadoop.html' title='Sun Grid Engine 6.2u5 and Hadoop'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-3036469953205207488</id><published>2010-02-18T11:33:00.000-08:00</published><updated>2010-02-18T11:57:36.693-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='HADOOP'/><category scheme='http://www.blogger.com/atom/ns#' term='gpu'/><category scheme='http://www.blogger.com/atom/ns#' term='MPI'/><title type='text'>Interesting HPC problem</title><content 
type='html'>We have just encountered what must be a generic problem in HPC, but one which is a little difficult to parallelize. The basic problem statement is very simple:&lt;br /&gt;&lt;br /&gt;Given a matrix of N rows and K columns, compare every row to every other row, counting the number of columns that are different (the Hamming distance). In our particular case, the researcher is only interested in those row pairs which have the "minimum" number of differences between them.&lt;br /&gt;&lt;br /&gt;Seems simple, although it is of order K*N^2/2. Naive approaches blow up fairly quickly, due to the possible output size, which is N^2/2 (row pair, value) combinations.&lt;br /&gt;(We have 8 million rows and 500 columns.) Without some type of output reduction, the comparison results alone are 32*10^12 rows.&lt;br /&gt;&lt;br /&gt;So there are 2 basic problems:&lt;br /&gt;&lt;br /&gt;1) Those (N^2)/2 comparisons need to be made very quickly. In our case we have binary data, so it looks like a series of exclusive ORs followed by the Intel SSE4 popcnt (population count) instruction will do a single comparison very quickly. It is even possible to do only a partial comparison, which "gives up" as soon as it finds more than the minimum number of differences (but that adds additional comparisons).&lt;br /&gt;&lt;br /&gt;2) The huge size of the output can at least be reduced by filtering, so that only results that are &lt;= the current minimum (or maybe some tolerance value) are output. &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Depending on the problem size and representation, it may be possible to do everything in memory, but that makes it harder to do in parallel. In our case, since we can represent our columns as bits, that is probably how we will proceed. But in general, that is unlikely. 
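&lt;br /&gt;&lt;br /&gt;A minimal sketch of the bit-packed comparison, written in Python rather than with the SSE4 instruction, and assuming each row has already been packed into an integer with one bit per column. The function name and the four sample rows are made up for illustration, and bin(...).count("1") stands in for a hardware population count.&lt;br /&gt;&lt;br /&gt;

```python
# Sketch only: rows are packed into Python ints, one bit per column.
# The function name and the tiny data set are illustrative assumptions.
from itertools import combinations


def closest_pairs(rows):
    """Return (minimum Hamming distance, list of (i, j) pairs at that distance)."""
    best = None
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        # XOR leaves a 1 bit in every column where the two rows differ;
        # counting those bits (a population count) gives the Hamming distance.
        distance = bin(a ^ b).count("1")
        if best is None or min(distance, best) != best:
            # Strictly smaller than anything seen so far: start a new result list.
            best = distance
            pairs = [(i, j)]
        elif distance == best:
            # Ties with the current minimum are kept; everything else is dropped.
            pairs.append((i, j))
    return best, pairs


rows = [0b10110, 0b10111, 0b00110, 0b01001]
print(closest_pairs(rows))
```

&lt;br /&gt;&lt;br /&gt;On the sample rows this reports a minimum distance of 1, shared by row pairs (0, 1) and (0, 2). The loop is still of order N^2; only the output is reduced, which is the point of step 2 above.&lt;br /&gt;&lt;br /&gt;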
We also have a GPU equipped system, which in theory could be programmed to do the comparison in parallel, although the data movement probably outweighs the savings in comparison time.&lt;br /&gt;&lt;br /&gt;The more general solution seems to be either some type of MPI solution which can at least distribute the comparisons across many processors, or a distributed parallel system like hadoop. &lt;br /&gt;&lt;br /&gt;I will discuss these alternatives in a later post.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-3036469953205207488?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/3036469953205207488/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=3036469953205207488' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/3036469953205207488'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/3036469953205207488'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/02/interesting-hpc-problem.html' title='Interesting HPC problem'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-8495356218703658683</id><published>2010-02-10T10:08:00.001-08:00</published><updated>2010-02-17T10:37:53.045-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='High Performance 
Computing'/><category scheme='http://www.blogger.com/atom/ns#' term='graphics processing unit'/><category scheme='http://www.blogger.com/atom/ns#' term='gpu'/><category scheme='http://www.blogger.com/atom/ns#' term='fermi hpc'/><category scheme='http://www.blogger.com/atom/ns#' term='statistical computing'/><category scheme='http://www.blogger.com/atom/ns#' term='processor chips'/><category scheme='http://www.blogger.com/atom/ns#' term='64 bit linux'/><title type='text'>Nvidia GPU has arrived</title><content type='html'>We have finally received our Super Micro Nvidia Tesla equipped server that we hope to use to speed up large matlab programs. So far, there has been a fair amount of frustration getting all of the components lined up. The matlab add-on toolkit, "Jacket" from Accelereyes, relies on the CUDA toolkit from NVIDIA, which in turn needs lots of other packages installed.  We had to rebuild the server under Fedora 10, and reinstall everything from scratch. Once the CUDA sdk examples were running, we thought we were finished.&lt;br /&gt;&lt;br /&gt;Not quite: we also needed the correct version of matlab for the jacket software, and even then things didn't work. Finally, setting the MATLAB_JAVA environment variable got everything working.&lt;br /&gt;&lt;br /&gt;Initial tests show some promise, but you can't just run your unchanged matlab code.&lt;br /&gt;You need to indicate to matlab what variables you want to be processed on the GPU. The GPU only supports a subset of full matlab functionality, so it takes a fair amount of experimentation to even get code running.&lt;br /&gt;&lt;br /&gt;Once running, there is another level of code rewriting to try to get as much of the computation done in the GPU as possible. 
It is obviously best suited for applications that spend most of their time doing things like manipulating large vectors and matrices.&lt;br /&gt;&lt;br /&gt;Many applications will run much SLOWER, since they may spend more time moving data to and from the GPU than they do in the actual processing. Our early attempts have only found a very modest speedup at best, but we know that those are not obvious candidates for a large speedup.&lt;br /&gt; &lt;br /&gt;One application that seems useful is multiple regression. Unfortunately, right now one can only do it in single precision mode using the \ operator instead of inversion of the X'X matrix. For small data sets the gpu is slower, but for really large ones, say 500 variables and 50,000 observations, it is about 10 times faster.&lt;br /&gt;&lt;br /&gt;Not bad.&lt;br /&gt;&lt;br /&gt;Dr. Dobb's Journal has a series of articles on the CUDA architecture. Very helpful in providing background on what Jacket is doing behind the scenes.&lt;br /&gt;http://www.drdobbs.com/hpc-high-performance-computing/207200659&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-8495356218703658683?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/8495356218703658683/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=8495356218703658683' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8495356218703658683'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8495356218703658683'/><link rel='alternate' type='text/html' 
href='http://researchcomputing.blogspot.com/2010/02/nvidia-gpu-has-arrived.html' title='Nvidia GPU has arrived'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-855959186434688946</id><published>2010-01-08T10:36:00.001-08:00</published><updated>2010-01-08T10:48:04.432-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='internal cloud'/><category scheme='http://www.blogger.com/atom/ns#' term='private cloud'/><category scheme='http://www.blogger.com/atom/ns#' term='grid computing'/><category scheme='http://www.blogger.com/atom/ns#' term='Virtualization'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><title type='text'>Private Cloud Computing</title><content type='html'>Although this is somewhat (but not much) off topic from research computing, I have to point out that I believe the most interesting aspect of cloud computing is in private clouds, i.e. a cloud computing infrastructure hosted internally instead of externally.&lt;br /&gt;The organization still has many of the benefits of cloud computing, but without many of the concerns (security, privacy, control, etc.)&lt;br /&gt;&lt;br /&gt;Building a cloud computing environment is not that difficult, especially for an organization that already has a major IT investment. CEOs should not be looking to outsource all of their computing to an outside cloud vendor, but rather ask their CIO why there isn't an internal cloud environment.&lt;br /&gt;&lt;br /&gt;As some of my earlier posts indicate, there are many advantages to cloud computing, even in an HPC environment, or a research environment. 
In a later post, I will share a student project on building a self provisioning cloud environment, as well as the advantages and disadvantages of cloud computing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-855959186434688946?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/855959186434688946/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=855959186434688946' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/855959186434688946'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/855959186434688946'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2010/01/private-cloud-computing.html' title='Private Cloud Computing'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-4062619150153837640</id><published>2009-10-23T08:49:00.000-07:00</published><updated>2009-10-23T09:00:40.882-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tesla'/><category scheme='http://www.blogger.com/atom/ns#' term='grid computing'/><category scheme='http://www.blogger.com/atom/ns#' term='gpu'/><category scheme='http://www.blogger.com/atom/ns#' term='cluster computing'/><category scheme='http://www.blogger.com/atom/ns#' term='fermi hpc'/><category 
scheme='http://www.blogger.com/atom/ns#' term='nvida'/><category scheme='http://www.blogger.com/atom/ns#' term='matlab'/><category scheme='http://www.blogger.com/atom/ns#' term='fft'/><title type='text'>HPC and GPU computing</title><content type='html'>Our small grid is quite busy, and air conditioning and power limit how far it can expand. Since most of our jobs are heavily CPU-intensive MATLAB jobs, we have started to investigate purchasing a server with one or two GPUs, probably NVIDIA Teslas, since the CUDA architecture has add-on support in MATLAB: a new MATLAB datatype indicates that the data should be operated on in the GPU if possible. The Tesla GPU has 240 cores, while the new Fermi GPU (not available yet) will have 512. For applications that have lots of inner products or FFTs (fast Fourier transforms), the GPU-enabled code should run 10-100 times faster. &lt;br /&gt;&lt;br /&gt;A number of vendors have 1U servers with two Nehalem (Intel) quad-core processors that can take one or two NVIDIA Tesla GPUs. The plan is to buy one for testing, and then upgrade to the Fermi GPU in the spring. 
If successful, this will  become our standard HPC server.&lt;br /&gt;&lt;br /&gt;If anyone has experience with matlab and nvidia gpus, I would love to talk to them.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-4062619150153837640?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/4062619150153837640/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=4062619150153837640' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/4062619150153837640'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/4062619150153837640'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2009/10/hpc-and-gpu-computing.html' title='HPC and GPU computing'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-2739588540076484492</id><published>2009-10-23T08:30:00.001-07:00</published><updated>2009-10-23T08:49:22.655-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='live migration'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' term='virtual servers'/><title type='text'>Update on the Stern Cloud</title><content type='html'>Our internal "cloud" has come along nicely, with most of our virtual machines migrated 
from Xen to KVM. We have had two recent examples of why cloud computing is so useful.&lt;br /&gt;The first was several weeks ago, when some researchers came to me on a Thursday at 4:30pm. They had just learned that a year-long project they had been working on was threatened because the small company that had been helping them gather data was out of money. The servers that were being used for gathering social network data were going to be turned off in the next week, and a year-long research effort was going to be severely impacted. &lt;br /&gt;&lt;br /&gt;After some long conversations about bandwidth, number of servers, memory, processing and disk, it seemed feasible to move the whole set of servers into our internal cloud.&lt;br /&gt;Some small glitches (we didn't have the ISOs for the slightly outdated version of CentOS they were running) were quickly overcome, and by Friday at 1pm, they had 4 servers configured with all the applications they needed. A consultant migrated the data gathering application to the new environment over the weekend. By Monday the whole environment had been moved and was up and running. Needless to say, the researchers were very happy. This would have been completely impossible if we had had to order machines, configure them, etc. It would have taken many weeks. Another alternative was Amazon EC2, but the costs would have been prohibitive.&lt;br /&gt;&lt;br /&gt;The second success story was last week, when we wanted to upgrade some servers with more memory to support our rapidly growing cloud. This was our first test of "live migration", where you move a virtual server from one physical server to another while the virtual server is still running. We moved the workload from one physical server to another, took the first server down, upgraded its memory, brought it back up, and migrated the workload back. 
Somewhat to our surprise, it worked perfectly.&lt;br /&gt;&lt;br /&gt;Very nice...&lt;br /&gt;&lt;br /&gt;Our next project is to let users design their own systems and self-provision. We have most of the machinery in place. A user will fill out a form, and about 30 minutes later have a virtual server configured to their specs, with a choice of OS, RAM, processors, disk, NICs, etc.&lt;br /&gt;&lt;br /&gt;The real problem is one of control and accountability. We'll work on that next.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-2739588540076484492?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/2739588540076484492/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=2739588540076484492' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2739588540076484492'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2739588540076484492'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2009/10/update-on-stern-cloud.html' title='Update on the Stern Cloud'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-8484855769642467051</id><published>2009-07-29T10:44:00.000-07:00</published><updated>2009-07-29T10:54:44.040-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' 
term='research computing'/><category scheme='http://www.blogger.com/atom/ns#' term='Virtual Machines'/><category scheme='http://www.blogger.com/atom/ns#' term='qemu'/><category scheme='http://www.blogger.com/atom/ns#' term='grid processing'/><category scheme='http://www.blogger.com/atom/ns#' term='kvm'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' term='XEN'/><title type='text'>Cloud Computing comes to Stern</title><content type='html'>My last post convinced me that HPC and cloud computing have some synergy. Since then, we have implemented a small "cloud" with 5 physical servers and 30TB of shared disk. After some fiddling with QEMU-KVM, we now have an environment where we can&lt;br /&gt;1) Quickly provision new machines and give them disk, CPU, public and private network access, etc.&lt;br /&gt;2) Move virtual machines easily from one server to another.&lt;br /&gt;3) Move Xen images to KVM (still testing).&lt;br /&gt;4) Replace physical machines with virtual machines, including NICs. This is very handy if you want to move a web server: just put up a virtual machine on the same network with the same MAC address. It can display a "web server down until X" message while you move the server; just before the server comes back up, take down the virtual machine.&lt;br /&gt;5) Provide researchers with the computing power they need, when they need it, but use idle cycles for other users.&lt;br /&gt;&lt;br /&gt;Soon, most of our grid processing will be "in the cloud" and independent of physical servers. 
Those month-long jobs that disrupt scheduling can easily be moved from server to server if necessary.&lt;br /&gt;&lt;br /&gt;I'll be on "cloud 9".&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-8484855769642467051?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/8484855769642467051/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=8484855769642467051' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8484855769642467051'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8484855769642467051'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2009/07/cloud-computing-comes-to-stern.html' title='Cloud Computing comes to Stern'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-3141920704401078067</id><published>2009-05-05T06:32:00.000-07:00</published><updated>2009-10-25T07:30:26.346-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='linear programming'/><category scheme='http://www.blogger.com/atom/ns#' term='assignment problem'/><category scheme='http://www.blogger.com/atom/ns#' term='grid computing'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' term='job 
scheduling'/><title type='text'>Scheduling in a heterogeneous grid computing environment - Some thoughts</title><content type='html'>Scheduling in any non-trivial environment is known to be a difficult task, often resulting in combinatorially difficult (NP-complete) problems. It is difficult in a homogeneous grid environment, where all of the nodes are identical. It is much more difficult in a heterogeneous environment such as the Stern grid, where the nodes have different speeds and differing amounts of memory, disk, etc. For instance, a standard problem as the grid gets busy is that all of the faster nodes fill up, and then jobs are eventually scheduled on the slower nodes. But as the faster nodes finish jobs, there may be idle capacity on the faster nodes. This doesn't help the jobs which are "stuck" on the slower nodes, as they slowly grind along. This both increases turnaround time and lowers throughput. &lt;br /&gt;&lt;br /&gt;The solution would be to move the still-running jobs to the faster nodes, where they would complete more quickly. This is of course not feasible... or is it?&lt;br /&gt;&lt;br /&gt;Imagine superimposing a virtual environment on the grid, so that each real machine runs as a virtual host for processing nodes. Using live migration, a standard feature of most virtualization (or cloud computing) environments, the workload could easily (and automatically) be moved as necessary. In fact, optimizing the workload becomes a simple linear programming model (actually an assignment problem), which could be re-solved as necessary to see if there is a better "assignment" of jobs to machines. 
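&lt;br /&gt;&lt;br /&gt;To make the assignment idea concrete, here is a minimal Python sketch. The run-time table and the names est_hours and best_assignment are made up for illustration, not taken from the Stern grid, and brute force over permutations is only practical for a handful of jobs; a real scheduler would more likely use the Hungarian algorithm or an LP solver.&lt;br /&gt;&lt;pre&gt;
```python
from itertools import permutations

# Hypothetical estimated run times in hours: est_hours[job][node].
# Node 0 is the fastest node, node 2 the slowest. Invented numbers.
est_hours = [
    [4.0, 5.0, 6.0],    # job 0 barely cares which node it gets
    [2.0, 3.0, 5.0],    # job 1
    [6.0, 12.0, 20.0],  # job 2 suffers badly on the slow nodes
]


def best_assignment(cost):
    """Return (total, nodes), where nodes[j] is the node given to job j.

    Brute force: tries every one-job-per-node permutation and keeps
    the one with the smallest total estimated run time.
    """
    n = len(cost)
    return min(
        (sum(cost[j][p[j]] for j in range(n)), p)
        for p in permutations(range(n))
    )


total, nodes = best_assignment(est_hours)
print(total, nodes)
```
&lt;/pre&gt;With these invented numbers the cheapest plan gives the fast node to the job that suffers most elsewhere, which is the kind of reshuffling live migration would make possible.&lt;br /&gt;&lt;br /&gt;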
In fact, as the workload builds up, some of the jobs may be moved back to the slower nodes if necessary.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-3141920704401078067?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/3141920704401078067/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=3141920704401078067' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/3141920704401078067'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/3141920704401078067'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2009/05/scheduling-in-heterogeous-grid.html' title='Scheduling in a heterogeneous grid computing environment - Some thoughts'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-2243267554730894335</id><published>2009-04-16T10:48:00.001-07:00</published><updated>2009-04-16T12:05:15.579-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='amd'/><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='&quot;live migration&quot;'/><category scheme='http://www.blogger.com/atom/ns#' term='Dell'/><category scheme='http://www.blogger.com/atom/ns#' term='Virtualization'/><category 
scheme='http://www.blogger.com/atom/ns#' term='kvm'/><category scheme='http://www.blogger.com/atom/ns#' term='XEN'/><category scheme='http://www.blogger.com/atom/ns#' term='Intel'/><title type='text'>Xen, KVM and VMWare comparisons</title><content type='html'>We started using VMware experimentally several years ago for minor development projects, but switched to XEN when it became supported by Red Hat last year. We have been moving slowly on a plan to migrate many of our servers onto XEN, so that we have the ability to move workloads around if we need to rebuild a machine, add new hardware, etc. &lt;br /&gt;&lt;br /&gt;Since we have a heterogeneous environment, with grid processing spread across machines from Dell 1650s to Dell 2950s, as well as some other oddball machines, we soon ran into problems. Our initial implementations seemed fine, and we soon had several virtual machines running on different hosts, and could shut the machines down and restart them on another host. But problems quickly developed when we tried to do a &lt;a href="http://www.linux.com/articles/55773"&gt;"live" migration&lt;/a&gt;, i.e. move a running machine from one physical host to another. We soon fixed some of the obvious problems, like the hosts not having the same network bridges, but migration would still fail. After more research and more experimentation, it became clear that XEN is very picky about the hosts that one uses for live migration. More research revealed that VMware has similar problems. &lt;br /&gt;&lt;br /&gt;Basically, since both XEN and VMware are hypervisors which run on the bare metal, the underlying CPU hardware has to be VERY similar (hopefully identical) for live migration to work. (Dell has a very nice table on their support site showing which Dell systems can support live migration to which other Dell systems. The matrix is pretty sparse.) 
In short, you can't be sure that you can live migrate a workload from one Dell 1950 to another unless they have the same CPU set, similar network setups, shared storage, etc.&lt;br /&gt;&lt;br /&gt;In addition, right after we started experimenting with XEN, Red Hat adopted KVM as their virtualization platform, and started devoting most of their resources to KVM. Hence the XEN support on Red Hat (or CentOS, which we run) is a year out of date.&lt;br /&gt;&lt;br /&gt;KVM is not quite as mature as XEN, but has a lot of effort behind it. VMware is much more mature than either alternative, but has similar problems to XEN when it comes to live migration. Since we are not close to having all of our hardware identical, KVM seems the only avenue. It MUST, however, have hardware virtualization capabilities in the processors it runs on. Although that rules out some of our servers, all of the more recent servers pass that test. KVM has the advantage that, since it uses Linux as the virtualization support, many of the really detailed hardware dependencies are hidden from the client VMs; hence live migration can supposedly be done between dissimilar hosts (e.g. Intel &lt;==&gt; AMD).&lt;br /&gt;&lt;br /&gt;Performance appears to be as good as or better than XEN, so that is the new direction we are heading. Our goal is to provide a "Stern Compute Cloud" for researchers. Special purpose machines as well as grid processing machines will all run in the cloud. New physical hosts will just provide more compute power to the cloud, and can be added at will without disrupting the current workload. Once a machine is up and tested, we can just live migrate running workloads onto it. 
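&lt;br /&gt;&lt;br /&gt;As an aside on the hardware virtualization requirement mentioned above: on Linux, the CPU flags in /proc/cpuinfo show whether a host qualifies (vmx for Intel VT-x, svm for AMD-V). Here is a small Python sketch of that check. The function name has_hw_virt and the sample text are just for illustration; this is not a tool we actually run.&lt;br /&gt;&lt;pre&gt;
```python
def has_hw_virt(cpuinfo_text):
    """True if any 'flags' line lists vmx (Intel VT-x) or svm (AMD-V)."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags":
            flags = set(value.split())
            if "vmx" in flags or "svm" in flags:
                return True
    return False


# Made-up sample in the /proc/cpuinfo layout. On a real Linux host one
# would pass open("/proc/cpuinfo").read() instead.
sample = "model name : Example CPU\nflags : fpu vme de pse vmx sse2\n"
print(has_hw_virt(sample))
```
&lt;/pre&gt;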
&lt;br /&gt;&lt;br /&gt;I'll keep you posted on the results.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-2243267554730894335?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/2243267554730894335/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=2243267554730894335' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2243267554730894335'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2243267554730894335'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2009/04/xen-kvm-and-vmware-comparisons.html' title='Xen, KVM and VMWare comparisons'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-8603099037012353006</id><published>2009-02-17T13:39:00.000-08:00</published><updated>2009-02-17T14:37:51.753-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='High Performance Computing'/><category scheme='http://www.blogger.com/atom/ns#' term='random number generation'/><category scheme='http://www.blogger.com/atom/ns#' term='Splus'/><category scheme='http://www.blogger.com/atom/ns#' term='grid computing'/><title type='text'>Splus, Grid Computing and random numbers</title><content type='html'>Splus has some unique problems for grid computing. 
One is obvious and is easy to solve.&lt;br /&gt;Splus stores its workspace results in a "CHAPTER". If you are running in parallel, you can't have multiple runs using the same chapter. In Sun Grid Engine, the solution is to use the TMPDIR environment variable, which guarantees a unique temporary folder for each run. You can see an example of how to do it &lt;a href="http://pages.stern.nyu.edu/~nwhite/scrc/Splusjobs.html"&gt; Here &lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The second problem, which I just discovered, is that the Splus set.seed function only generates 1024 (not the 1000 stated in the documentation) unique sequences. Each sequence appears to be a non-overlapping section of length 2^22 of a single sequence of length 2^32. For instance, set.seed(0) and set.seed(1024) are equivalent. Not a big problem in a serial computing environment, but a nasty problem in a grid environment where you may be "unrolling" Monte Carlo simulations, making each trial a separate run and using the run number (or task number, SGE_TASK_ID) to set the seed. You may think that you are getting, say, 5000 trials, but in fact only 1024 of them will be unique and the rest will be duplicates.&lt;br /&gt;&lt;br /&gt;The problem is made worse by there being only one sequence of 2^32. 
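&lt;br /&gt;&lt;br /&gt;The duplicate count is easy to check with a few lines of Python. This is only a sketch of the arithmetic, assuming (as the set.seed(0) / set.seed(1024) observation suggests) that seeds differing by a multiple of 1024 select the same sequence; the function name distinct_streams is made up for illustration.&lt;br /&gt;&lt;pre&gt;
```python
SEED_CYCLE = 1024  # number of distinct sequences observed for set.seed


def distinct_streams(n_tasks, cycle=SEED_CYCLE):
    """Count distinct seeds when task IDs 1..n_tasks are used as seeds
    and seeds that differ by a multiple of `cycle` are equivalent."""
    return len({task_id % cycle for task_id in range(1, n_tasks + 1)})


print(distinct_streams(5000))  # 1024
print(distinct_streams(1000))  # 1000
```
&lt;/pre&gt;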
For now, our solution is just a hack, until we or someone else can dig into the Splus algorithm and find a better way.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Solution &lt;/span&gt;(inefficient, but should work).&lt;br /&gt;Say that the maximum number of random numbers that you generate in a single run is N.&lt;br /&gt;If you are on the Ith run, and I is 1024 or greater, generate N * floor(I/1024) random numbers and throw them away. If each seed really does select a section of length 2^22, this only works while N * (floor(I/1024) + 1) stays below 2^22; beyond that, a run spills into the section belonging to the next seed.&lt;br /&gt;&lt;br /&gt;(S-PLUS sketch, untested)&lt;br /&gt;&lt;br /&gt;# I is the run number (e.g. SGE_TASK_ID); N is the most random numbers any one run draws&lt;br /&gt;set.seed(I)&lt;br /&gt;JJ &lt;- floor(I/1024)&lt;br /&gt;# skip the numbers that earlier runs with an equivalent seed will use&lt;br /&gt;if (JJ &gt; 0) junk &lt;- runif(N * JJ)&lt;br /&gt;&lt;br /&gt;A much better solution would be to have a set.seed function that is smart enough to not cycle after 1024 sequences.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-8603099037012353006?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/8603099037012353006/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=8603099037012353006' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8603099037012353006'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8603099037012353006'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2009/02/splus-and-grid-computing-and-random.html' title='Splus, Grid Computing and random numbers'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-9051910187683488191</id><published>2009-02-10T14:13:00.001-08:00</published><updated>2009-02-10T14:42:33.856-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Virtual Machines'/><category scheme='http://www.blogger.com/atom/ns#' term='HPC'/><category scheme='http://www.blogger.com/atom/ns#' term='cluster computing'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' term='XEN'/><title type='text'>Virtual Machines,  HPC and local cloud computing</title><content type='html'>Virtualization may not seem to be relevant to HPC, but it offers some very nice capabilities. We are currently experimenting with XEN on the new nodes of our heterogeneous cluster. The plan is that all of the new servers will run XEN and all of the compute nodes will run in domU virtual machines, which can be moved from one physical machine to another WHILE they are running. Then, if a node has a problem, we can move the workload off, fix the problem and move the workload back. XEN can do this with only about 100 milliseconds of downtime by first copying the whole memory image while the virtual machine keeps running, and then doing a final resync after stopping the virtual machine. &lt;br /&gt;&lt;br /&gt;A second option with virtualization is to allow users to "build" their own machines, i.e. a local cloud environment. There can be a standard set of choices: Linux, WinXP, Win 2007 server, ... A user fills out a form, specifies their configuration and how long they need it, and they get an email describing how to access it. 
This is actually quite easy to do.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-9051910187683488191?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/9051910187683488191/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=9051910187683488191' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/9051910187683488191'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/9051910187683488191'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2009/02/blog-post.html' title='Virtual Machines,  HPC and local cloud computing'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-2895911200706726242</id><published>2008-04-05T08:09:00.000-07:00</published><updated>2008-04-05T11:22:58.991-07:00</updated><title type='text'>Hadoop and compressed files</title><content type='html'>I have spent a lot of time in the last month experimenting with my Hadoop cluster. I was having very odd results until I discovered some idiosyncrasies in how Hadoop processes compressed files. (All of our option data files are compressed using the Windows WinZip utility.) 
I had just copied a day's worth of option quotes and trades into Hadoop and was trying to develop a simple report, basically just total trade volume by symbol. It should have been easy, since Hadoop supposedly supports compressed files and will uncompress them automatically. Not so simple...&lt;br /&gt;&lt;br /&gt;Hadoop DOES support compressed files, but only gzipped files. I finally discovered this by digging through the documentation on the Java libraries shipped with Hadoop. That is one gotcha. The second is that Hadoop can only uncompress a file on one node, since the Java code needs to read the whole compressed file to uncompress it. (This puts some constraints on where Hadoop can process the file, and how many nodes it can use to process it.) The OPRA data comes as 24 compressed files (to keep each file below 4GB, I think). That wouldn't be too bad if Hadoop could actually read the compressed files, but now we will need to uncompress and recompress using gzip.&lt;br /&gt;&lt;br /&gt;But the good news is that an experiment on the recompressed option data indicates that I can run a report on a whole day's data in only a few minutes. Since our plan is to clean the data by eliminating quotes that are not best bid or ask, we will need to uncompress and recompress anyway. We are close to having our first live Hadoop application.&lt;br /&gt;&lt;br /&gt;The application is trivial: in the map step of Hadoop, we just pull out all of the trade records, and cut out the symbol field and the volume field. This information is sent to the reduce step, which just adds up the volumes. (Actually, the reduce step is just a 'cat' command that puts all of the map files together and sorts them by symbol.) Then a small Python program just aggregates by symbol. For some reason, running the Python program as the reduce step doesn't work. Not sure why yet.&lt;br /&gt;&lt;br /&gt;Here is the mapper shell program that will be run on every node against the input data. 
Note that Hadoop is automatically unzipping the files.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;#!/bin/sh&lt;br /&gt;#&lt;br /&gt;# gawkisemapper.sh&lt;br /&gt;#&lt;br /&gt;# this is a test map for the streaming hadoop support&lt;br /&gt;# all it is going to do is to grep for trades i.e. column 2 = "a"&lt;br /&gt;# and cut out the symbol&lt;br /&gt;#&lt;br /&gt;# Note, it depends on HADOOP correctly unzipping the files and&lt;br /&gt;# sending the output to STDOUT&lt;br /&gt;#&lt;br /&gt;# print out trade symbol and volume for all trades&lt;br /&gt;# reduce step will sum up volumes&lt;br /&gt;# it does several things&lt;br /&gt;# 1) translates all control characters to NEWLINES (original input is in IP packets)&lt;br /&gt;# 2) Pulls out all records that are record type " a"  (option trades)&lt;br /&gt;# 3) Uses gawk to pull out the symbol, option month and year fields and volume fields (cc 19-23, 26-27 and cc 37-42) and put a tab between them&lt;br /&gt;# 4) Lastly concatenate the symbol, month, year fields together and change all blanks to "_" so they appear as one field&lt;br /&gt;#&lt;br /&gt;#  Output will be something like&lt;br /&gt;# BVD__O8\t4500     (i.e. Bear Stearns March 08 put, 4500 contracts)&lt;br /&gt;&lt;br /&gt;  tr -s '[:cntrl:]' '\n'|\&lt;br /&gt;  grep " a"|\&lt;br /&gt;  gawk -F "" '{print $19$20$21$22$23$26$27"\t"$37$38$39$40$41$42}'|\&lt;br /&gt;  sed -e 's/ /_/g'&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The code to run the whole process is:&lt;br /&gt;&lt;br /&gt;#!/bin/sh&lt;br /&gt;#&lt;br /&gt;# runstreaming.sh&lt;br /&gt;#&lt;br /&gt;export HADOOP_HOME="/rnddata/hadoop"&lt;br /&gt;#&lt;br /&gt;# run a map reduce on gzipped Feedcapture files to count the volume of trades by symbol in a given day&lt;br /&gt;#&lt;br /&gt;# data is in ise/2005/2005MM/2005MMDD/*.gz&lt;br /&gt;#&lt;br /&gt;$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-0.16.0-streaming.jar \&lt;br /&gt; -input ise/2005/200505/20050527/*.gz  \&lt;br /&gt; -output 
ise/output/20050527/  \&lt;br /&gt; -mapper $HADOOP_HOME/nhw/gawkisemapper.sh  \&lt;br /&gt; -reducer "cat"  -jobconf "stream.recordreporter.compression=gzip" \&lt;br /&gt; -numReduceTasks 1&lt;br /&gt;#&lt;br /&gt;echo DONE&lt;br /&gt;#&lt;br /&gt;# When done, run&lt;br /&gt;#bin/hadoop dfs -cat ise/output/20050527/*|reduce.py &gt; tradebysymbol.dat&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-2895911200706726242?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/2895911200706726242/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=2895911200706726242' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2895911200706726242'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2895911200706726242'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/04/hadoop-and-compressed-files.html' title='Hadoop and compressed files'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-4099031660730966264</id><published>2008-04-05T07:55:00.000-07:00</published><updated>2008-04-05T08:08:19.959-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='options data'/><category scheme='http://www.blogger.com/atom/ns#' term='Splus'/><category 
scheme='http://www.blogger.com/atom/ns#' term='grid computing'/><category scheme='http://www.blogger.com/atom/ns#' term='volatility'/><category scheme='http://www.blogger.com/atom/ns#' term='Sun Grid Engine'/><category scheme='http://www.blogger.com/atom/ns#' term='Sun Computer'/><category scheme='http://www.blogger.com/atom/ns#' term='simulation'/><category scheme='http://www.blogger.com/atom/ns#' term='statistical computing'/><category scheme='http://www.blogger.com/atom/ns#' term='matlab'/><category scheme='http://www.blogger.com/atom/ns#' term='Apple'/><category scheme='http://www.blogger.com/atom/ns#' term='Centos'/><category scheme='http://www.blogger.com/atom/ns#' term='64 bit linux'/><title type='text'>64 bit computing  - Upgrade to Matlab 64 bit</title><content type='html'>We are finally being pushed into the world of 64 bit computing. I am surprised it took us so long, but we now have several projects that need to reference arrays larger than 4 GB.&lt;br /&gt;So far, two of our nodes, a Sun Solaris V240 and a custom dual-processor Opteron, are running a 64 bit operating system. We have already upgraded to the 64 bit version of Matlab on those nodes, and will follow with Splus. Within a few weeks, we should have all of our 64 bit capable nodes upgraded to the 64 bit version of CentOS 5.1 (a Red Hat Linux clone).&lt;br /&gt;&lt;br /&gt;We have also inherited a 5TB Apple Xserve array from a discontinued project, with dual fiber connections. After some experimenting, we discovered that the fiber card in the Apple G5 that the array was attached to was just a standard LSI Logic card. &lt;br /&gt;Soon, it was installed in one of our Linux systems, so we have just added another 5TB of storage. Just in time, as it turns out, since our OPRA options data feed is now generating 150GB of data a night (50GB compressed). This is up from a normal volume of about 20GB compressed. The stock market volatility really shows up in the quotes in the options data.  
Two years ago we were getting only around 10GB per night. I will talk more about problems in processing this data in a future post.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-4099031660730966264?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/4099031660730966264/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=4099031660730966264' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/4099031660730966264'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/4099031660730966264'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/04/64-bit-computing-upgrade-to-matlab-64.html' title='64 bit computing  - Upgrade to Matlab 64 bit'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-4833907145692465321</id><published>2008-03-04T14:43:00.001-08:00</published><updated>2008-03-04T14:50:36.596-08:00</updated><title type='text'>Some tips on installing Hadoop</title><content type='html'>Now that my HADOOP cluster is up and running, I can describe the problems I hit that others might run into. 
All in all, it went pretty smoothly.&lt;br /&gt;&lt;br /&gt;The most confusing thing was that many of my machines have multiple NICs, and we refer to the machines by different names depending on which NIC we want to access. This really confused HADOOP, since it depends on every machine being able to talk to every other machine. I had to go through my hosts file and make sure that every machine had one name and one IP address. Before I did that, HADOOP would hiccup and burp as it tried to run Map Reduce jobs. I finally discovered it was moving the map jobs from one machine to another as they failed and were restarted. It would eventually finish, but only after a long time. Once I fixed the hosts file, it would chug right through large jobs without a problem.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-4833907145692465321?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/4833907145692465321/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=4833907145692465321' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/4833907145692465321'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/4833907145692465321'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/03/some-tips-on-installing-hadoop.html' title='Some tips on installing Hadoop'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-801742556339927428</id><published>2008-03-01T16:06:00.001-08:00</published><updated>2008-03-04T06:22:52.524-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='HADOOP'/><category scheme='http://www.blogger.com/atom/ns#' term='map reduce'/><category scheme='http://www.blogger.com/atom/ns#' term='grid computing'/><category scheme='http://www.blogger.com/atom/ns#' term='cluster computing'/><category scheme='http://www.blogger.com/atom/ns#' term='rdbms'/><title type='text'>Map Reduce Controversy</title><content type='html'>Just as I am putting the finishing touches on my hadoop cluster, and am starting to load real data into it, I discover that there is now a major controversy developing over the whole map reduce framework. It started on the blog run by Michael Stonebraker and others, called &lt;a href=http://www.databasecolumn.com/2008/01/mapreduce-a-major-step-back.html&gt;"The Data Base Column" (column, get it??)&lt;/a&gt;&lt;br /&gt;and was responded to (in at least one blog) &lt;a href=http://scienceblogs.com/goodmath/2008/01/databases_are_hammers_mapreduc.php&gt;here&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Basically Stonebraker et al. argue that the map reduce model is simplistic, not new, etc. &lt;br /&gt;&lt;br /&gt;But the comment that REALLY got me was when they criticized the Map Reduce model for not supporting indexes, and forcing table scans to retrieve data. They really don't get it. As a rule of thumb, any time you need to look at over 10% of a table, table scans are faster than using indexes. Today's systems can scan a table very fast, since CPU and memory speeds are increasing much, much faster than disk random access speeds. Map Reduce (and hadoop) is not a technology in competition with relational databases, but a technology that was designed to scale forever (or at least a lot). 
Let's see any RDBMS run on 10,000 servers.&lt;br /&gt;&lt;br /&gt;No one is suggesting that the Map Reduce model replace RDBMSs for transaction and business data processing applications (well, not yet), but for write once, read many applications such as those we see in social science research and on the internet, map reduce looks very nice.&lt;br /&gt;&lt;br /&gt;I will know more very soon, since I am now moving a large amount of data from different projects into our hadoop cluster. Stay tuned...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-801742556339927428?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/801742556339927428/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=801742556339927428' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/801742556339927428'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/801742556339927428'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/03/map-reduce-controversy.html' title='Map Reduce Controversy'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-7128362426704180167</id><published>2008-02-28T06:36:00.000-08:00</published><updated>2008-02-28T06:40:01.438-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' 
term='HADOOP'/><category scheme='http://www.blogger.com/atom/ns#' term='S3'/><category scheme='http://www.blogger.com/atom/ns#' term='grid computing'/><category scheme='http://www.blogger.com/atom/ns#' term='EC2'/><category scheme='http://www.blogger.com/atom/ns#' term='Amazon Elastic Compute Cloud'/><category scheme='http://www.blogger.com/atom/ns#' term='Simple Storage Systems'/><title type='text'>Amazon EC2 and Hadoop</title><content type='html'>&lt;a href=http://developer.amazonwebservices.com/connect/entry.jspa?externalID=873&gt;Guide to running HADOOP on Amazon&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Turns out that Amazon has an HADOOP image that is ready to run on EC2 and S3. They make it very simple, and even have a sample application that parses web logs. They can process 100GB of web logs in 35 minutes at a cost of $2.00 (not counting Storage and data transfer in and out)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-7128362426704180167?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/7128362426704180167/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=7128362426704180167' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7128362426704180167'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7128362426704180167'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/02/amazon-ec2-and-hadoop.html' title='Amazon EC2 and Hadoop'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-2978802002788174892</id><published>2008-02-25T11:41:00.001-08:00</published><updated>2008-02-25T12:00:30.352-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='High Performance Computing'/><category scheme='http://www.blogger.com/atom/ns#' term='NFS'/><category scheme='http://www.blogger.com/atom/ns#' term='HPC'/><category scheme='http://www.blogger.com/atom/ns#' term='pnfs'/><title type='text'>pNFS - Yet another approach to parallel file systems</title><content type='html'>&lt;a href=http://www.acmqueue.org/modules.php?name=Content&amp;pa=showpage&amp;pid=503&amp;page=1&gt;ACM Article on pNFS&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The article describes a new version of NFS (Network File System) that incorporates a metadata server that knows which server holds which portions of a file. This allows clients to access data in parallel from multiple servers (like Lustre). 
It should be a nice performance boost for NFS.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-2978802002788174892?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/2978802002788174892/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=2978802002788174892' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2978802002788174892'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2978802002788174892'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/02/pnfs-yet-another-approach-to-parallel.html' title='pNFS - Yet another approach to parallel file systems'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-6606562444458712461</id><published>2008-02-25T07:01:00.000-08:00</published><updated>2008-02-25T11:45:23.229-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='HADOOP Streaming'/><title type='text'>HADOOP and Streaming Support</title><content type='html'>One of the neat features of HADOOP appears to be its support for &lt;a href=http://hadoop.apache.org/core/docs/current/streaming.html&gt; Streaming. &lt;/a&gt;&lt;br /&gt;&lt;br /&gt;By that they mean the ability to use any unix command that can act like a filter (i.e. 
take input from STDIN, do something to it, and generate output on STDOUT) to do either the MAP or REDUCE processing. This means that some systems which were originally designed to run in a sequential mode can automatically be parallelized.&lt;br /&gt;Cool.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-6606562444458712461?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/6606562444458712461/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=6606562444458712461' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/6606562444458712461'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/6606562444458712461'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/02/hadoop-and-streaming-support.html' title='HADOOP and Streaming Support'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-7255542313609109252</id><published>2008-02-25T06:45:00.000-08:00</published><updated>2008-02-25T06:55:02.424-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='DRM'/><category scheme='http://www.blogger.com/atom/ns#' term='High Performance Computing'/><category scheme='http://www.blogger.com/atom/ns#' term='HADOOP'/><category scheme='http://www.blogger.com/atom/ns#' 
term='Portable Batch System'/><category scheme='http://www.blogger.com/atom/ns#' term='HPC'/><category scheme='http://www.blogger.com/atom/ns#' term='grid computing'/><category scheme='http://www.blogger.com/atom/ns#' term='Sun Grid Engine'/><category scheme='http://www.blogger.com/atom/ns#' term='Sun Computer'/><category scheme='http://www.blogger.com/atom/ns#' term='Torque'/><category scheme='http://www.blogger.com/atom/ns#' term='PBS'/><category scheme='http://www.blogger.com/atom/ns#' term='Distributed Resource Manager'/><title type='text'>HADOOP and Sun Grid Engine</title><content type='html'>http://blogs.sun.com/ravee/entry/creating_hadoop_pe_under_sge&lt;br /&gt;&lt;br /&gt;Has  a step by step example of how to modify HADOOP and Sun Grid Engine so they play together nicely. The HADOOP distribution comes with a component called HOD (HADOOP on Demand), that will automatically configure a set of nodes to run the Torque grid scheduling software so that users can dynamically request a HADOOP configuration to do processing. Torque and Sun Grid Engine are fairly similar, so it shouldn't be a surprise to see how to use SGE to schedule HADOOP jobs. Since Sun Grid Engine and Torque are two of the most popular DRMs (Distributed Resource Managers) around, this is good news.&lt;br /&gt;&lt;br /&gt;What is important here, is that medium and large HPC environments can now be used for parallel processing of io intensive jobs, and not just compute intensive jobs. 
The bottleneck will be whether or not each node has its own dedicated disk(s), so HADOOP can distribute the load nicely.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-7255542313609109252?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/7255542313609109252/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=7255542313609109252' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7255542313609109252'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7255542313609109252'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/02/hadoop-and-sun-grid-engine.html' title='HADOOP and Sun Grid Engine'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-8926285118189315275</id><published>2008-02-24T07:32:00.000-08:00</published><updated>2008-02-24T07:59:15.166-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='HADOOP'/><category scheme='http://www.blogger.com/atom/ns#' term='grid computing'/><category scheme='http://www.blogger.com/atom/ns#' term='cluster computing'/><category scheme='http://www.blogger.com/atom/ns#' term='Google'/><category scheme='http://www.blogger.com/atom/ns#' term='XEN'/><category scheme='http://www.blogger.com/atom/ns#' term='social 
networks'/><category scheme='http://www.blogger.com/atom/ns#' term='VDOOP'/><title type='text'>HADOOP up and running at the Stern Center for Research Computing</title><content type='html'>As you can see from my recent posts, I am excited about HADOOP (an open source version of the GOOGLE tools), and how it might be used for research computing at the Stern School of Business. We already have a small high performance computing (HPC) cluster to handle the computationally intensive research that many of our faculty do. A HADOOP cluster would allow us to attack research applications that demand large amounts of file processing with a tested and powerful set of tools. YAHOO just announced that they had put a very large HADOOP cluster into production to produce their "page map". The HADOOP cluster ran their application on the same hardware as their proprietary solution (Dreadnought), but 1.5 times faster.&lt;br /&gt;&lt;br /&gt;While social science research, such as is done at business schools, is not in that class yet, there are many application areas which need a better architecture for handling their data. These include almost any kind of large social network analysis; merging the large economic, financial, government and news data that universities have access to; as well as more flexible ways for storing and analyzing the rapidly growing amount of textual data available.&lt;br /&gt;&lt;br /&gt;HADOOP seems to be the best option available, so I decided to investigate how hard it would be to install on some of the machines in our cluster. &lt;br /&gt;&lt;br /&gt;My first attempt was a failure, as I tried to install VDOOP, a virtual set of XEN nodes running HADOOP.  VDOOP needs to run on top of XEN, which is available in the latest Linux kernels. So, I downloaded the VDOOP rpm, and installed it on one of my servers. MISTAKE! The server was immediately put out of commission and no one except root could log in to it. Permission denied messages for all non-root users. 
I eventually traced the problem to  "/","/usr","/usr/bin", and "/usr/lib" all having their permissions changed. A chmod 755 fixed them. Turns out that there is a fair amount of work to get XEN working, and it wasn't going to be compatible with the way our Sun Grid Engine cluster was configured. Since I was now wary of VDOOP,  I decided to just install HADOOP directly.&lt;br /&gt;&lt;br /&gt;To my delight, it was up and running in under an hour (including downloading it), and able to process test jobs. &lt;br /&gt;Since it is all java, you just need to point it to a jdk1.5 distribution and go. By default it puts all of its files in /tmp.&lt;br /&gt;&lt;br /&gt;Today, I plan to add more nodes and more disk, and try something a little more stressful (like parsing the 150GB of options data we get every night, or merging some of our economic and financial information with the results of some web crawling). I can easily have  about a TB of space spread across multiple nodes to test with. I'll post some results here.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-8926285118189315275?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/8926285118189315275/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=8926285118189315275' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8926285118189315275'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8926285118189315275'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/02/hadoop-up-and-running-at-stern-center.html' title='HADOOP up and 
running at the Stern Center for Research Computing'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-1206230983933856621</id><published>2008-02-21T12:02:00.000-08:00</published><updated>2009-02-10T14:41:00.490-08:00</updated><title type='text'>VDOOP - A virtual HADOOP</title><content type='html'>Just when I thought I had found everything, I discovered this. VDOOP is a Virtual HADOOP. It will allow you to set up a pilot version of HADOOP&lt;br /&gt;on a linux server that supports XEN. You just specify how many virtual servers you want, as well as some other info, and BOOM, you have a HADOOP. I need to try it (soon...).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-1206230983933856621?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/1206230983933856621/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=1206230983933856621' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/1206230983933856621'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/1206230983933856621'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/02/vdoop-virtual-hadoop.html' title='VDOOP - A virtual HADOOP'/><author><name>Norman 
White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-1422047176485917954</id><published>2008-02-19T12:03:00.000-08:00</published><updated>2008-02-19T12:06:16.897-08:00</updated><title type='text'>Pig Latin for HADOOP</title><content type='html'>Pig is the HADOOP project which has developed a high level data filtering and analysis system. It has its own language (Pig Latin). One could think of it as a high level relational language that uses HADOOP as its processing system.&lt;br /&gt;You can find out more at &lt;br /&gt;&lt;a href=http://wiki.apache.org/pig/&gt;PIG Project&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-1422047176485917954?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/1422047176485917954/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=1422047176485917954' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/1422047176485917954'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/1422047176485917954'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/02/pig-latin-for-hadoop.html' title='Pig Latin for HADOOP'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-6575197044316389618</id><published>2008-02-19T11:42:00.000-08:00</published><updated>2008-02-19T11:49:09.175-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='HADOOP'/><category scheme='http://www.blogger.com/atom/ns#' term='grid computing'/><category scheme='http://www.blogger.com/atom/ns#' term='search engines'/><category scheme='http://www.blogger.com/atom/ns#' term='Microsoft Yahoo merger'/><category scheme='http://www.blogger.com/atom/ns#' term='Yahoo'/><title type='text'>Yahoo announces the worlds largets production HADOOP application</title><content type='html'>HADOOP apparently now scales to 10,000 nodes and has become a core part of Yahoo's infrastructure.&lt;br /&gt;&lt;br /&gt;&lt;a href=http://developer.yahoo.com/blogs/hadoop/2008/02/yahoo-worlds-largest-production-hadoop.html&gt;Yahoo announcement&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Since there are very few organizations in the world that have more than 10000 node HPC systems, this sounds pretty good.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-6575197044316389618?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/6575197044316389618/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=6575197044316389618' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/6575197044316389618'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/6614598184406322281/posts/default/6575197044316389618'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/02/yahoo-announces-worlds-largets.html' title='Yahoo announces the worlds largets production HADOOP application'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-7511717957253396365</id><published>2008-02-18T07:51:00.000-08:00</published><updated>2008-02-18T20:40:03.881-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='HADOOP'/><category scheme='http://www.blogger.com/atom/ns#' term='Microsft'/><category scheme='http://www.blogger.com/atom/ns#' term='distributed computing'/><category scheme='http://www.blogger.com/atom/ns#' term='Microsoft Yahoo merger'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' term='Yahoo'/><category scheme='http://www.blogger.com/atom/ns#' term='Apache'/><category scheme='http://www.blogger.com/atom/ns#' term='Amazon Elastic Compute Cloud'/><title type='text'>Parallel Computing and the Hadoop Project, Google, Yahoo, and Microsoft</title><content type='html'>As many of you probably know, Google has developed its own custom infrastructure, designed to support the massive demands it has for computation. It includes the Google Clusters, the Google File System, Map Reduce, Sawzall, and BigTable, to name a few components. 
All of these are designed to handle the many petabytes of data that Google has to process each day, including not just supporting its crawling and indexing activities, but other products like Google Analytics, Blogger, AdWords, AdSense, Google Maps, ...&lt;br /&gt;&lt;br /&gt;A simplistic description of the capability would be to take the unix grep and sort and uniq filters and have them work in a distributed environment over 1000s of computers and petabytes of data. BigTable adds a relational-like database capability. (HADOOP has a BigTable clone called HBASE, you can read about it&lt;a href="http://wiki.apache.org/hadoop/Hbase"&gt; here&lt;/a&gt;.)&lt;br /&gt;&lt;br /&gt;Some of us view that infrastructure as Google's key competitive advantage, since it uses it in a variety of different ways. It can build new products on top of it quickly, and use it to link its many projects together. This is an important capability in this networked world. By combining information from their many products and services, they can add increasing value due to the network externalities involved.&lt;br /&gt;&lt;br /&gt;There are now university courses devoted to how Search Engine Companies like &lt;a href="http://www.google.com"&gt;Google&lt;/a&gt;, &lt;a href="http://www.yahoo.com"&gt;Yahoo&lt;/a&gt; and &lt;a href="http://www.microsoft.com"&gt;Microsoft&lt;/a&gt; are extending Search into many new areas.&lt;br /&gt;(For example, the &lt;a href="http://www.stern.nyu.edu"&gt;Stern School of Business&lt;/a&gt; at &lt;a href="http://www.nyu.edu"&gt;New York University&lt;/a&gt; has two new courses named "&lt;span style="font-weight:bold;"&gt;Search and the New Economy&lt;/span&gt;", one taught by me (&lt;a href="http://www.stern.nyu.edu/~nwhite"&gt;undergraduate&lt;/a&gt;), and one by Panos Ipeirotis (&lt;a href="http://www.stern.nyu.edu/~panos"&gt;MBA&lt;/a&gt;)). The courses focus on how Search Engines and related technologies are transforming business. 
John Battelle's book, "Search", is a good place to get an overview of the issues.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;But why is this of interest to the research computing community? &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;In 2004, an interesting effort was started to replicate the Google toolset in an open source environment. This project was called HADOOP, and it grew quickly. In 2006, Yahoo joined the project (if you can't beat them, join them) and began to move its infrastructure onto HADOOP. You can see the timeline here, at the &lt;a href="http://developer.yahoo.net/blog/archives/2007/07/yahoo-hadoop.html"&gt;Yahoo Blog on Hadoop.&lt;/a&gt; You can see an interesting video describing HADOOP &lt;a href="http://us.dl1.yimg.com/download.yahoo.com/dl/ydn/eric14_ipod.m4v"&gt;here.&lt;/a&gt; Yahoo began to see an order-of-magnitude improvement in sort timings (sorting is one of the basic operations necessary for Search). &lt;br /&gt;&lt;br /&gt;In 2007, the Apache Software Foundation adopted HADOOP as a top-level project, which means it is now being supported by developers all over the world, and will fit into the Apache project.&lt;br /&gt;&lt;br /&gt;Meanwhile, there are a number of related projects which are built on top of HADOOP, including &lt;a href="http://lucene.apache.org/java/"&gt;Lucene Java&lt;/a&gt;, an open source Java search engine library; &lt;a href="http://lucene.apache.org/nutch/"&gt;NUTCH&lt;/a&gt;, a Search Application built on top of Lucene; &lt;a href="http://lucene.apache.org/mahout/"&gt;MAHOUT&lt;/a&gt;, a scalable suite of machine learning libraries; and &lt;a href="http://incubator.apache.org/tika/"&gt;TIKA&lt;/a&gt;, a library for parsing and extracting data from text documents.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Why is this important?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;What is of interest here is that these tools can dramatically increase the ability to do high-end business research projects, which have been 
difficult up to now. We have researchers building machine learning applications, crawling the web for information about business topics, extracting data from public documents like SEC filings, building customized search engines for research uses, ...&lt;br /&gt;&lt;br /&gt;In general, HADOOP and related projects now extend grid and cluster computing concepts into the text data mining and analysis arena, beyond strictly scientific computing applications. &lt;br /&gt;&lt;br /&gt;In an earlier post, I described how the Stern Center for Research Computing has been experimenting with the Lustre File System. One of the applications is processing the large amount of data that we receive every night from the Options Price Reporting Authority (OPRA). Our current approach parses the data using the unix awk and grep commands, then sorts it. This could all be done directly in the HADOOP environment, and would presumably scale linearly with the number of nodes. &lt;br /&gt;&lt;br /&gt;These tools should be able to drop into a cluster environment easily, or, along with&lt;br /&gt;Amazon's EC2 and S3 services (see this &lt;a href="http://developer.amazonwebservices.com/connect/entry.jspa?categoryID=112&amp;externalID=873"&gt;article on how to do this&lt;/a&gt;), could be used remotely and dynamically as projects need resources. This is very exciting, and now extends the capabilities of world class infrastructures down to small research projects and small businesses. This could be the beginning of a whole new approach to computing.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;And what are the implications for the planned Microsoft-Yahoo merger?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;What is even less clear now is how a Microsoft-Yahoo merger can work given Yahoo's commitment to the HADOOP/GOOGLE architecture. 
On the other hand, a GOOG-YHOO merger is a slam dunk, since their back end architectures are now very similar, and their researchers and developers speak the same language. If the MSFT, YHOO merger goes through, I think one could expect many of the Yahoo developers to vote with their feet and move to Google.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-7511717957253396365?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/7511717957253396365/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=7511717957253396365' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7511717957253396365'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7511717957253396365'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/02/parallel-computing-and-hadoop-project.html' title='Parallel Computing and the Hadoop Project, Google, Yahoo, and Microsoft'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-5560523126697019734</id><published>2008-02-10T07:37:00.001-08:00</published><updated>2008-02-10T15:20:15.011-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ओएप्शन'/><category scheme='http://www.blogger.com/atom/ns#' term='High Performance Computing'/><category 
scheme='http://www.blogger.com/atom/ns#' term='cluster file system'/><category scheme='http://www.blogger.com/atom/ns#' term='grid computing'/><category scheme='http://www.blogger.com/atom/ns#' term='quotes'/><title type='text'>Update on the Lustre Cluster File System Project</title><content type='html'>We finally found some time to spend on the cluster file system (&lt;a href="http://www.lustre.org"&gt;Lustre&lt;/a&gt;, now owned by &lt;a href="http://www.sun.com"&gt;SUN&lt;/a&gt;) here at the &lt;a href="http://www.stern.nyu.edu"&gt;NYU Stern&lt;/a&gt; &lt;a href="http://www.stern.nyu.edu/scrc"&gt;Center for Research Computing&lt;/a&gt;. I have added some more disk, and we have run more performance tests. So far, out of our 5 node &lt;br /&gt;cluster, 4 nodes are actually working (one node has a bad NIC), and providing about 140MB/sec aggregate throughput. Doesn't seem like all that much, but remember this is just a test going against slow (40 MB/sec max) drives, so the max aggregate bandwidth would be 160MB/sec. When I get the broken node back on-line, we should go up to about 175MB/sec sustained write speeds. My timings have been done from two client nodes, one with a 1 Gb NIC, and the other with two 1 Gb NICs, channel bonded.&lt;br /&gt;&lt;br /&gt;To our surprise, one of the limiting factors is CPU. Even though these are very fast 3.73 GHz Intel Xeon processors (&lt;a href="http://www.dell.com"&gt;Dell&lt;/a&gt; 1950s), it takes about 15 seconds of CPU to write a 1 GB file (dd if=/dev/zero ... bs=4096 count=25000). If you do the math, you will see that 1 process can only write 67MB/sec.&lt;br /&gt;&lt;br /&gt;The next limiting factor for throughput from a single machine is the NIC. A 1 Gb NIC seems to give only about 750 Mb/sec of throughput, or a little under 100MB (93.75) per second. So even with multiple processes running on 1 machine, we couldn't get to the aggregate throughput. 
When we ran the tests (4 1GB dd's) from the machine with the 2 NICs, we could finally hit the 133MB/sec aggregate rate, as we did when we ran the tests from 2 client machines.&lt;br /&gt;&lt;br /&gt;In any event, we are ready to put the cluster file system into production, as soon as we replace the bad NIC. At that point, every machine in our grid will look as if it had (at a minimum) a 97MB/sec drive. Our dual ported systems should be able to write at about 175MB/sec to a 1.5TB file system. Not too shabby. &lt;br /&gt;&lt;br /&gt;One of the applications we need this amount of bandwidth for is processing the large amount of OPRA (&lt;a href="http://www.opradata.com"&gt;Options Price Reporting Authority&lt;/a&gt;) options data we have been collecting for the last several years for the &lt;a href="http://www.stern.nyu.edu/salomon"&gt;Salomon Center&lt;/a&gt; here at Stern. The data includes information on every quote and trade for all options on all exchanges. There are over 3000 option symbols traded on any day, with quotes changing constantly. The data is sent to us from the ISE (&lt;a href="http://www.iseoptions.com/"&gt;International Securities Exchange&lt;/a&gt;) every night as 24 compressed files. It is now close to 50GB of data a night, 150GB when it is uncompressed. We need some temporary space to process the data, and we need to access it from any number of machines (dozens). NFS becomes an immediate bottleneck.&lt;br /&gt;&lt;br /&gt;As we go forward, we can upgrade our small Lustre system and replace the OSTs (Object Storage Targets) with RAID devices. (We currently use Western Scientific RAID devices.) Now instead of 40MB/sec max per device, it will be closer to 200MB/sec, and aggregate rates will jump (but the 1 Gb NICs on the OSSs will have to be upgraded as well).&lt;br /&gt;&lt;br /&gt;It is easy to see how one could quickly build a small cluster that had a 1 TB/second throughput rate, and was totally reliable and redundant. 
(Lustre allows failover of both devices and nodes, so you can lose any device or any node without a failure).&lt;br /&gt;&lt;br /&gt;If you want to see the exact commands it took to build the cluster file system, you can see them at &lt;a href="http://www.stern.nyu.edu/scrc/createlustre.html"&gt;Building the Lustre file system at Stern&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-5560523126697019734?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/5560523126697019734/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=5560523126697019734' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/5560523126697019734'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/5560523126697019734'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/02/update-on-lustre-cluster-file-system.html' title='Update on the Lustre Cluster File System Project'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-2429044382133098867</id><published>2008-01-21T08:04:00.000-08:00</published><updated>2008-01-23T13:29:04.842-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cluster file system'/><category scheme='http://www.blogger.com/atom/ns#' term='grid 
computing'/><category scheme='http://www.blogger.com/atom/ns#' term='Sun Grid Engine'/><category scheme='http://www.blogger.com/atom/ns#' term='EC2'/><category scheme='http://www.blogger.com/atom/ns#' term='Amazon Elastic Compute Cloud'/><title type='text'>Cluster File Systems and the Amazon Elastic Cloud</title><content type='html'>Just a thought, but one could also use EC2 as both a compute facility and a cluster file server. Suppose you had an application that generates a huge amount of data that then needs to be processed. Since each EC2 machine has a minimum of 160GB of disk, one could easily configure a set of machines that were both Sun Grid Engine (or whatever) clients as well as Cluster File system targets.&lt;br /&gt;&lt;br /&gt;You provision the machines on EC2, add them to your cluster, and ship your jobs to them.&lt;br /&gt;Now you are not only adding compute power, but large amounts of disk at the same time.&lt;br /&gt;Nice..&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-2429044382133098867?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/2429044382133098867/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=2429044382133098867' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2429044382133098867'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2429044382133098867'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/01/cluster-file-systems-and-amazon-elastic.html' title='Cluster File Systems and the Amazon Elastic Cloud'/><author><name>Norman 
White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-6580301419815885009</id><published>2008-01-21T07:47:00.000-08:00</published><updated>2008-02-18T11:33:23.030-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='IBM'/><category scheme='http://www.blogger.com/atom/ns#' term='Amazon Elastic Cloud'/><category scheme='http://www.blogger.com/atom/ns#' term='Blue Cloud'/><category scheme='http://www.blogger.com/atom/ns#' term='grid computing'/><category scheme='http://www.blogger.com/atom/ns#' term='Sun Grid Engine'/><category scheme='http://www.blogger.com/atom/ns#' term='Open VPN'/><title type='text'>OpenVPN and The Amazon Elastic Compute Cloud</title><content type='html'>The latest issue of Linux Journal (http://www.linuxjournal.com, login required) has a nice article about how to use OpenVPN to set up a LAN infrastructure that can easily be extended to outside hosts (like EC2s). &lt;br /&gt;&lt;br /&gt;What is the problem?&lt;br /&gt;&lt;br /&gt;The problem is that if you want to dynamically add processing power from a bunch of hosts that are not in your private subnet (for instance, all of our grid computers are in a 192.168.1.x private subnet), you need to make some changes:&lt;br /&gt;&lt;br /&gt;1) You need to go to a 10.x.y.z class B private subnet to be able to see all of your hosts.&lt;br /&gt;2) More importantly, you need an infrastructure that allows you to directly tunnel remote hosts into your subnet, i.e. 
you need some type of VPN solution to get the traffic through your corporate firewall.&lt;br /&gt;&lt;br /&gt;OpenVPN seems to provide the necessary tools.&lt;br /&gt;&lt;br /&gt;The example in the article shows how to use OpenVPN to link three data centers together so that they all see the same static 10.x.y.z network, even though the machines are housed at 3 different locations on the internet. Assume one "datacenter" is a set of machines in EC2 that you wish to turn on and off as necessary. This should make them appear as if they were all in the same data center, connected to a common switch.&lt;br /&gt;Very nice.&lt;br /&gt;&lt;br /&gt;The problem that we have at NYU Stern is that we need to switch to a class B network&lt;br /&gt;just to be ready to do something like this. The class C subnet we are using now is already filling up fast, as we continue to add machines, many with 2 - 5 NICs. (Yes, we are doing channel bonding, which reduces the set of IP addresses, but we still would be incapable of just adding another 1000 hosts dynamically for an hour or so.)&lt;br /&gt;&lt;br /&gt;We'll see; I hope that someone else will do the initial work of adding Sun Grid Engine hosts using EC2. Should be pretty straightforward.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Oh, and by the way. 
Amazon is not alone in offering cloud computing, here is IBM's press release on "Blue Cloud", their cloud computing initiative.&lt;br /&gt;&lt;a href="http://www-03.ibm.com/press/us/en/pressrelease/22613.wss"&gt;IBM Blue Cloud Press Release&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-6580301419815885009?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/6580301419815885009/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=6580301419815885009' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/6580301419815885009'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/6580301419815885009'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/01/openvpn-and-amazon-elastic-compute.html' title='OpenVPN and The Amazon Elastic Compute Cloud'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-7151772056213592301</id><published>2008-01-18T07:11:00.000-08:00</published><updated>2008-01-18T07:40:59.423-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Windows Applications'/><category scheme='http://www.blogger.com/atom/ns#' term='Stern School'/><category scheme='http://www.blogger.com/atom/ns#' term='grid computing'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Citrix'/><category scheme='http://www.blogger.com/atom/ns#' term='remote computing'/><title type='text'>Adding Windows Based Research Applications</title><content type='html'>Many researchers prefer doing their research on their office desktop machines, rather than using our more powerful cluster. It is hard to support those users, since we can't look at their files, or see what they are doing. In some cases they may be running analyses from their home, or on their notebook as they travel. Software licensing and support become problems, since we need to purchase licenses for applications that may get very little use. A recent improvement to the school's infrastructure is helping to solve many of these problems.&lt;br /&gt;&lt;br /&gt;In a move to save space and provide more flexibility for our students, Stern recently phased out its graduate computer labs, and replaced them with a large (5 server) Citrix installation. Citrix is an enhancement to Microsoft's terminal server environment that allows remote users to run Windows applications on other machines, as if they were sitting in front of the machine. Each user gets their own desktop, and access to their local hard drive and printer, as well as network drives. Each server can support up to 100 simultaneous users doing moderate work like Word or Excel applications. More importantly, we now have much better control over software licensing and maintenance, since we are only doing it on 5 machines instead of 70. &lt;br /&gt;&lt;br /&gt;A major result of the new environment is that we can offer applications to users who can be anywhere in the world, an important consideration with our many globally based curricula. Students and faculty do not need to install applications on their own machines, since any machine with a web browser can access the Citrix applications.&lt;br /&gt;&lt;br /&gt;What good is this for research computing? 
Our staff spends a good bit of time every year trying to support our researchers in the many environments they work in: at school, at home, and traveling. In addition, we have a continual flow of visiting scholars, faculty, and guests who need access to our systems and databases. We now can provide these users with immediate access to our "facilities", even if they are not in residence locally.&lt;br /&gt;&lt;br /&gt;Given the highly cyclical nature of the demand for lab resources from our students, the server farm (APPS.STERN.NYU.EDU) is often very lightly loaded. There are some research applications like Eviews which only run in a Windows environment. Our Citrix installation now allows researchers to run their Windows jobs remotely, on very fast and powerful machines. In the future, we may even be able to use Sun Grid Engine to schedule these applications from our grid computing environment, essentially extending that environment to support Windows apps as well as Linux/Unix.&lt;br /&gt;&lt;br /&gt;This should be a win for everyone involved...
&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-7151772056213592301?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/7151772056213592301/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=7151772056213592301' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7151772056213592301'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/7151772056213592301'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/01/adding-windows-based-research.html' title='Adding Windows Based Research Applications'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-1327341525819356287</id><published>2008-01-17T13:49:00.000-08:00</published><updated>2008-01-19T11:50:44.689-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='S3'/><category scheme='http://www.blogger.com/atom/ns#' 
term='grid computing'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' term='EC2'/><category scheme='http://www.blogger.com/atom/ns#' term='Simple Storage Systems'/><category scheme='http://www.blogger.com/atom/ns#' term='elastic compute cloud'/><title type='text'>Cloud Computing - The Next Big Thing?</title><content type='html'>I finally took a look at Amazon's Elastic Compute Cloud service (EC2). It only took a few minutes to see how important this could become. You can put up a new server in minutes, and then replicate many of them to do things like add processing nodes to my computing grid. In fact, it may be cheaper to use the Amazon Cloud for a substantial proportion of the Center's grid computing, since you don't pay for cycles that aren't used. At the current price of 10 cents a compute hour for the equivalent of a pretty fast Xeon machine with 2GB of RAM and 160GB of disk, the breakeven point is somewhere around 40% of 24x7 operation, i.e. it would only pay to actually buy a machine if it was going to be in use more than 8 hours of CPU a day. There are lots of servers that never hit an hour a day of CPU. In my case, the cloud could be used to handle spikes in demand. And you get to follow the cost/performance curve without having to continually invest in new equipment. The only downside is getting data into and out of the cloud.&lt;br /&gt;&lt;br /&gt;For CPU intensive grid computing, that isn't a problem. But for grinding on large data sets, the economics change and you have to sharpen the pencil. Hopefully, I will find the resources to set up and test a cloud server add-on to our grid. If one works, so would 100.&lt;br /&gt;&lt;br /&gt;Add on to that the Amazon Simple Storage Service (S3), which is a totally reliable fast access storage system, again aggressively priced, and you have a combination that looks hard to beat, especially for startups, or peak demand applications. 
Now anyone can have the kind of infrastructure that Amazon has, without all of the support overhead. (Amazon uses the Elastic Compute Cloud for its own systems).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-1327341525819356287?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/1327341525819356287/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=1327341525819356287' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/1327341525819356287'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/1327341525819356287'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/01/cloud-computing-next-big-thing.html' title='Cloud Computing - The Next Big Thing?'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-1972795867082376570</id><published>2008-01-08T12:22:00.000-08:00</published><updated>2008-01-17T14:15:47.953-08:00</updated><title type='text'>Long Overdue Update</title><content type='html'>Now that the Fall term has ended, I can add a few new items. Basic changes have been the installation of yet another server, this time a dual processor quad core Xeon Dell 1950. It really cooks, providing a 50% increase in performance over our dual processor, dual core hyper-threaded Dell 1950s. The quad core processors are apparently not hyper-threaded, since Linux only sees 4 processors, the same as it sees on the older dual core processors. Development of our cluster file system was stopped with the graduation of the student who had been working on it, but we have recently started working on it again. In the meantime, Sun bought the rights to the Lustre system, and is apparently planning to add its features to ZFS. &lt;br /&gt;&lt;br /&gt;Some early benchmarks with multiple clients hitting the file system seem to corroborate the claims that it scales near linearly. We just need to add some storage arrays as data targets to see the kind of performance we need. One high risk strategy would be to put all of our storage arrays under Lustre and create a high bandwidth, highly reliable cluster file system. I don't quite have enough disk to do that yet... Maybe a grant would help out..
&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-1972795867082376570?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/1972795867082376570/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=1972795867082376570' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/1972795867082376570'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/1972795867082376570'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2008/01/long-overdue-update.html' title='Long Overdue Update'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-3643024529543223430</id><published>2007-04-25T09:20:00.000-07:00</published><updated>2007-04-25T09:26:39.794-07:00</updated><title type='text'>Update on Lustre , the Cluster File System</title><content type='html'>Our initial tests with Lustre are very promising. 
It was much easier to install than we expected and appears to be able to do everything we need to do. We are running small tests in a "sandbox". We took three of our older nodes (dual processor AMD Athlons) off the grid, and are using them as the first 3 nodes and master for a baby clustered file system. If the experiment is successful, we will add SCSI cards to the nodes, and attach some of the storage arrays we are rebuilding, so we have a lot of FAST, RELIABLE disk space. We may also take some less reliable individual disks and use them for fast scratch space. This now gives us a growth path that will allow us to recycle old, slower nodes into targets on the cluster file system. For info,&lt;br /&gt;go to http://www.lustre.org&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-3643024529543223430?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/3643024529543223430/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=3643024529543223430' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/3643024529543223430'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/3643024529543223430'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2007/04/update-on-lustre-cluster-file-system.html' title='Update on Lustre , the Cluster File System'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-127219906156648167</id><published>2007-04-03T13:06:00.000-07:00</published><updated>2007-04-03T13:16:58.713-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='port trunking'/><category scheme='http://www.blogger.com/atom/ns#' term='cluster file system'/><category scheme='http://www.blogger.com/atom/ns#' term='iscsi'/><title type='text'>Cluster File System and ISCSI</title><content type='html'>We have a number of older nodes that are probably not worth upgrading anymore, so we are investigating turning them into the beginnings of a clustered file system. One plan would be to take the nodes and fill them with disks and then run a cluster file system so that those disks could be mounted on any machine in our cluster. Another option would be to take a LUN from one of our large storage arrays, point an iSCSI target at it, and use a cluster file system to mount the LUN on all nodes. We would use port trunking to make the storage array available over a 2 - 4 Gb/sec pipe. We are just starting some timings now, and should know more in a few weeks. 
Anyone who has had experience with this, please let me know.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-127219906156648167?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/127219906156648167/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=127219906156648167' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/127219906156648167'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/127219906156648167'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2007/04/cluster-file-system-and-iscsi.html' title='Cluster File System and ISCSI'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-8457435914835628050</id><published>2007-03-02T08:26:00.000-08:00</published><updated>2007-03-02T08:32:43.337-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='High Performance Computing'/><category scheme='http://www.blogger.com/atom/ns#' term='amd'/><category scheme='http://www.blogger.com/atom/ns#' term='Floating Point'/><category scheme='http://www.blogger.com/atom/ns#' term='HPC'/><category scheme='http://www.blogger.com/atom/ns#' term='grid computing'/><category scheme='http://www.blogger.com/atom/ns#' term='quad core'/><category 
scheme='http://www.blogger.com/atom/ns#' term='processor chips'/><title type='text'>New Quad Core Chips from AMD - Very fast floating point</title><content type='html'>AMD just announced a new generation of quad core chips that will be shipping in the fall of 2007. A dual processor system with these chips will appear to be an 8 processor system (2 chips, 4 cores per chip). These Opteron cores do not support hyperthreading, so each core runs a single thread.&lt;br /&gt;&lt;br /&gt;Perfect for grid computing...&lt;br /&gt;&lt;br /&gt;See the article from InformationWeek below, or go to the original article at&lt;br /&gt;&lt;br /&gt;http://update.informationweek.com/cgi-bin4/DM/y/m44X0GMQ380G4n0E56T0FX&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Looking To Best Intel, AMD Floors Quad-Core Performance&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;AMD's upcoming Barcelona quad-core processor will sport floating-point performance 42 percent higher than Intel's Xeon X5355, AMD claims.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;By Rick Merritt&lt;br /&gt;EE Times&lt;br /&gt;&lt;br /&gt;February 28, 2007 07:03 PM&lt;br /&gt;&lt;br /&gt;SAN FRANCISCO, Calif. — Advanced Micro Devices' upcoming Barcelona processor will sport floating-point performance 42 percent higher than Intel's current top-of-the-line CPU, the Xeon X5355, also known as Clovertown.&lt;br /&gt;&lt;br /&gt;The news marked the first performance numbers AMD has provided for the chip that packs four Opteron cores on a single die and will be in production this fall. AMD also demonstrated working versions of its next-generation graphics chip the R600 to be released by the end of June.&lt;br /&gt;&lt;br /&gt;AMD has been under pressure from archrivals Intel and Nvidia. Intel is shipping multiple quad-core processors using a system-in-package approach, claiming it has retaken the performance lead in x86 CPUs. 
Nvidia shipped a new generation graphics controller last fall, a move the graphics division of AMD has not yet answered.&lt;br /&gt;&lt;br /&gt;Mario Rivas, general manager of AMD's microprocessor group, said Barcelona will provide a double-digit leap in integer performance over the quad-core Xeon, though he declined to be more specific. Henri Richard, chief of sales and marketing at AMD, said Barcelona will have a significant integer performance lead over Intel's quad-core chips.&lt;br /&gt;&lt;br /&gt;Although the floating point advantage is significant, few applications outside high performance computing and video encoding make use of it. Nevertheless, analysts were positive on the news.&lt;br /&gt;&lt;br /&gt;"I thought Richard's comments were a strong vote of confidence in the product," said Nathan Brookwood, principal of market watcher Insight64 (Saratoga, Calif.).&lt;br /&gt;&lt;br /&gt;"Virtualized Web servers don't use floating-point processing, but if AMD is seeing integer performance gains over Intel in double digits, that's a positive for them," said Dean McCarron, principal of Mercury Research (Cave Creek, Ariz.), referring to the most mainstream application for Barcelona.&lt;br /&gt;&lt;br /&gt;Separately, AMD gave one of the first public demos of the R600, its next-generation graphics controller that uses 320 multiply-accumulate units. The company showed a Barcelona-based system using two 200W R600 graphics cards to hit a terabit/second benchmark.&lt;br /&gt;&lt;br /&gt;Release of the R600 has been delayed "a few weeks" so that AMD can roll out a full suite of graphics chips covering multiple market segments for the latest Microsoft DirectX 10 applications programming interface. 
Rival Nvidia rolled out its high-end DX10 graphics controller, the GeForce 8800 last fall but has not filled out its product line with midrange and low-end parts based on it yet.&lt;br /&gt;&lt;br /&gt;"As soon as AMD makes their DX10 announcements, I am sure we will hear about competing products from Nvidia," said McCarron.&lt;br /&gt;&lt;br /&gt;In addition, AMD announced a new desktop chip set, the first from the ATI division since the merger last fall. The AMD 690 sports an ATI Radeon X1250 graphics core and a new video decode block. It is also the former ATI's first chip set to support the HDMI video interface with HDCP copy protection for high definition video.&lt;br /&gt;&lt;br /&gt;Ten motherboard makers said they will ship as many as 30 products with the chip.&lt;br /&gt;&lt;br /&gt;"They put a lot of emphasis on home entertainment with this chip set," said McCarron. "It's a stronger graphics core than they have used in the past," he added.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-8457435914835628050?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/8457435914835628050/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=8457435914835628050' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8457435914835628050'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8457435914835628050'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2007/03/new-quad-core-chips-from-amd-very-fast.html' title='New Quad Core Chips from AMD - Very fast floating point'/><author><name>Norman 
White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-6257768451877559063</id><published>2007-02-19T15:22:00.000-08:00</published><updated>2007-03-02T08:43:41.114-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='firewall'/><category scheme='http://www.blogger.com/atom/ns#' term='hosts.deny'/><category scheme='http://www.blogger.com/atom/ns#' term='ssh'/><category scheme='http://www.blogger.com/atom/ns#' term='iptables'/><category scheme='http://www.blogger.com/atom/ns#' term='hackers'/><title type='text'>Denyhosts</title><content type='html'>I have started running a Python script called "denyhosts" on several of our publicly available servers. It seems to work well, with a few glitches. &lt;br /&gt;&lt;br /&gt;Basically, it monitors the system log files and checks for break-in attempts. When it sees an attempt (you can control how sensitive it is), it adds the host IP address to the /etc/hosts.deny file. In addition, it will upload your hosts.deny file to a central server and download new entries from the central server. I very quickly got to over 5500 blocked IP addresses.&lt;br /&gt;&lt;br /&gt;That is the good news. The bad news is that you really have to be careful or you can lock out legitimate machines/users. For instance, we run OpenNMS to monitor all of our machines. OpenNMS tests ssh on every machine every 5 minutes to make sure the service is running. This ends up in the system log file /var/log/secure as an invalid login. After a number of these, the monitoring host is blocked, and it takes a fair bit of work to get it unblocked again. 
I eventually wrote a shell script to automate the "unblocking" of a host.&lt;br /&gt;&lt;br /&gt;It seems to be running very nicely now, and I am going to roll it out to some more systems. One possible enhancement would be to add the blocked IP addresses directly to the iptables firewall we run, so that the packets are not even let through.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-6257768451877559063?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/6257768451877559063/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=6257768451877559063' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/6257768451877559063'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/6257768451877559063'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2007/02/denyhosts.html' title='Denyhosts'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-9073158952859531329</id><published>2007-02-17T11:32:00.000-08:00</published><updated>2007-02-17T11:34:42.211-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='computing in higher education'/><category scheme='http://www.blogger.com/atom/ns#' term='Educause'/><title type='text'>EDUCAUSE information on Research Computing</title><content 
type='html'>I just found this link to a large  body of materials on research computing at Educause. Educause is the primary organization devoted to the use of computers in higher education.&lt;br /&gt;&lt;br /&gt;http://www.educause.edu/content.asp?page_id=645&amp;Parent_ID=789&amp;bhcp=1&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-9073158952859531329?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/9073158952859531329/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=9073158952859531329' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/9073158952859531329'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/9073158952859531329'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2007/02/educause-information-on-research.html' title='EDUCAUSE information on Research Computing'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-3440523177187274263</id><published>2007-02-17T06:28:00.000-08:00</published><updated>2007-02-17T06:37:17.616-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='parallel computing'/><category scheme='http://www.blogger.com/atom/ns#' term='grid computing'/><category scheme='http://www.blogger.com/atom/ns#' term='Sun Grid 
Engine'/><category scheme='http://www.blogger.com/atom/ns#' term='matlab'/><title type='text'>Parallel Matlab??</title><content type='html'>Well, we had a speaker in our research seminar this week who was the first researcher who looked like he could use a version of parallel &lt;a href="http://www.matlab.com"&gt;matlab&lt;/a&gt;. His problem was so large that he could only attack little pieces at a time. Then yesterday, I was looking at the Matlab code for one of our PhD students, and I think that it too could be run in parallel. Until now, most of our researchers have used our grid by running many copies of their code, but the runs are independent of each other, i.e. there is no communication between the running processes. That is easy to do with Sun Grid Engine (see my earlier post). Truly running them in parallel means that some of the large matrices that they are dealing with would be split across many machines/processors, with the code on each responsible for its part of the matrix. There are libraries that are used in high performance computing to do this, but one would have to recode the Matlab application to use them. The MathWorks recently announced their version of parallel Matlab. I need to take a look at it. 
I will report later.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-3440523177187274263?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/3440523177187274263/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=3440523177187274263' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/3440523177187274263'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/3440523177187274263'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2007/02/parallel-matlab.html' title='Parallel Matlab??'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-2613446303884399636</id><published>2007-02-14T06:36:00.000-08:00</published><updated>2007-02-15T11:25:17.453-08:00</updated><title type='text'>Hacked Server revisited</title><content type='html'>I spent some time yesterday cleaning up the hacked server to see if I could at least get it back on-line for a few minutes to retrieve some files. Found many registry settings that were starting up sysres.exe and cssrs.exe. Deleted them, deleted all of the files etc. Rebooted with logging on to see what was loading. Ran active ports to see what ports were open. 
Then, with trepidation, connected it to the network and rebooted.&lt;br /&gt;As soon as it came up, I ran active ports (a port monitoring tool that shows you what tcp/ip connections are being made). Within 30 seconds, the server was trying to connect (on port 80) to a machine in Germany. Killed that process and waited. Another few seconds, and a connection from a local machine came in on port 139 (netbios). Killed that and waited. Nothing else. I then ran nmap from a remote machine to see what ports were still open. 123 was open. Didn't seem normal. Killed it and the machine rebooted, complaining that the rpc (Remote Procedure Call) server had stopped.  Oh well. Try again today.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-2613446303884399636?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/2613446303884399636/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=2613446303884399636' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2613446303884399636'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2613446303884399636'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2007/02/hacked-server-revisited.html' title='Hacked Server revisited'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-5523815993167830087</id><published>2007-02-13T07:25:00.000-08:00</published><updated>2007-02-13T08:01:18.649-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='parallel computing'/><category scheme='http://www.blogger.com/atom/ns#' term='Sun Grid Engine'/><category scheme='http://www.blogger.com/atom/ns#' term='monte carlo'/><category scheme='http://www.blogger.com/atom/ns#' term='simulation'/><category scheme='http://www.blogger.com/atom/ns#' term='matlab'/><title type='text'>Using Task Ids under Sun Grid Engine to manage large simulations</title><content type='html'>One of the nice features of &lt;a href="http://gridengine.sun.com"&gt;Sun Grid Engine&lt;/a&gt; is the ability to easily schedule a large number of related jobs, and manage them as one. Our researchers use this feature all the time in managing their jobs.&lt;br /&gt;&lt;br /&gt;It works like this:&lt;br /&gt;&lt;br /&gt;qsub -t 1-100 myjob.sh&lt;br /&gt;&lt;br /&gt;will submit a job and run it 100 times, setting the environment variable SGE_TASK_ID to each value from 1 to 100 in turn.&lt;br /&gt;&lt;br /&gt;The job is then "parameterized" by this variable, and can base its behavior on the value of SGE_TASK_ID.&lt;br /&gt;&lt;br /&gt;Most often, this variable is used to set the "seed" of a random number generator to a different value for each run of a Monte Carlo simulation. This makes large runs "repeatable", and gives each run its own random number stream, so the runs do not simply duplicate each other. 
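&lt;br /&gt;&lt;br /&gt;As a minimal sketch of this seeding pattern, assuming a job written in Python (an assumption; any language that can read environment variables would do), a task could derive its seed as shown here. The names task_seed and simulate are hypothetical, and simulate is only a stand-in for a real simulation:&lt;br /&gt;&lt;pre&gt;
```python
import os
import random


def task_seed(default=1):
    """Return the SGE task ID as an integer seed.

    Assumption: outside an array job SGE_TASK_ID may be unset or
    non-numeric, so fall back to a default for local testing.
    """
    raw = os.environ.get("SGE_TASK_ID", "")
    return int(raw) if raw.isdigit() else default


def simulate(seed, draws=5):
    """Hypothetical stand-in for one Monte Carlo run: a few uniform draws."""
    rng = random.Random(seed)  # private generator, seeded per task
    return [rng.random() for _ in range(draws)]


if __name__ == "__main__":
    seed = task_seed()
    print(seed, simulate(seed))
```
&lt;/pre&gt;Because each task builds its own generator from its task ID, rerunning a single task reproduces exactly the draws it produced the first time.&lt;br /&gt;&lt;br /&gt;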
The jobs are then automatically scheduled on any of the execution nodes based on availability, software licenses etc.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.matlab.com"&gt;Matlab&lt;/a&gt;, &lt;a href="http://www.insightful.com"&gt;Splus&lt;/a&gt; and &lt;a href="http://www.sas.com"&gt;Sas&lt;/a&gt; all support the ability to retrieve the value of an environment variable.&lt;br /&gt;&lt;br /&gt;In Matlab and Splus the "getenv" function will retrieve the value of an environment variable. In Sas it is "sysget".&lt;br /&gt;&lt;br /&gt;For instance, in Matlab you would do something like:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;StrSeed=getenv('SGE_TASK_ID');&lt;br /&gt;NumSeed=str2num(StrSeed);&lt;br /&gt;rand('state',NumSeed);&lt;br /&gt;&lt;br /&gt;....&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Each one of the 100 runs would then use a different random number sequence, with the seed set by "SGE_TASK_ID".&lt;br /&gt;&lt;br /&gt;If for some reason some of the runs didn't complete (a node died, for instance), those runs are easily repeated.&lt;br /&gt;&lt;br /&gt;There are many applications of tasks in SGE. One common one is frame rendering of a large video file. The file can be processed frame by frame, with each job handling a group of frames. This allows the job to be run in parallel across many machines.&lt;br /&gt;&lt;br /&gt;Sun Grid Engine also provides a unique temporary directory for each job/task. Its path is placed in an environment variable called "TMPDIR".&lt;br /&gt;&lt;br /&gt;Hence, a job script (named testjob.sh) might do something like:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;#! 
/bin/sh&lt;br /&gt;#$ -N jobname&lt;br /&gt;#$ -l h_cpu=10:00:00&lt;br /&gt;#$ -l matlab=1&lt;br /&gt;&lt;br /&gt;cd $TMPDIR&lt;br /&gt;cp /home/myhome/myfolder/* .&lt;br /&gt;matlab &lt; mymatjob.m&lt;br /&gt;cp results.out /home/myhome/myfolder/results.out.$SGE_TASK_ID&lt;br /&gt;&lt;br /&gt;To submit the job, you just do a&lt;br /&gt;&lt;br /&gt;qsub -t 1-100 testjob.sh&lt;br /&gt;&lt;br /&gt;Sun Grid Engine will find an empty machine, track Matlab license usage, and run all of the jobs.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Standard output and error files will be placed in your home directory named&lt;br /&gt;&lt;br /&gt;jobname.oXXXX.yyyyy&lt;br /&gt;and&lt;br /&gt;jobname.eXXXX.yyyyy&lt;br /&gt;&lt;br /&gt;where XXXX is the job number and yyyyy is the task number.&lt;br /&gt;&lt;br /&gt;Your results file will be copied to your folder "myfolder" and named&lt;br /&gt;results.out.yyyyy&lt;br /&gt;&lt;br /&gt;Voila!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-5523815993167830087?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/5523815993167830087/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=5523815993167830087' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/5523815993167830087'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/5523815993167830087'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2007/02/using-task-ids-under-sun-grid-engine-to.html' title='Using Task Ids under Sun Grid Engine to manage large simulations'/><author><name>Norman 
White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-5979590605458156962</id><published>2007-02-12T09:45:00.000-08:00</published><updated>2007-02-12T09:49:38.501-08:00</updated><title type='text'>Intel announces 80 cores on a chip</title><content type='html'>This new &lt;a href="http://www.intel.com/"&gt;Intel&lt;/a&gt; chip should speed things up. But you never know. I was very excited about the new &lt;a href="http://www.sun.com/"&gt;Sun&lt;/a&gt; Sunfire T2000, which had dual 8-core processors, until I discovered that there was only one floating point unit per processor. Great for web serving, email etc, but not for HPC.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.nytimes.com/2007/02/12/technology/12chip.html?_r=1&amp;ref=business&amp;amp;oref=slogin"&gt;http://www.nytimes.com/2007/02/12/technology/12chip.html?_r=1&amp;ref=business&amp;amp;oref=slogin&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-5979590605458156962?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/5979590605458156962/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=5979590605458156962' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/5979590605458156962'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/5979590605458156962'/><link 
rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2007/02/intel-announces-80-cores-on-chip.html' title='Intel announces 80 cores on a chip'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-887700111605804750</id><published>2007-02-12T09:06:00.000-08:00</published><updated>2007-02-12T07:42:10.644-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='High Performance Computing'/><category scheme='http://www.blogger.com/atom/ns#' term='Sony PS3'/><category scheme='http://www.blogger.com/atom/ns#' term='cell processors'/><title type='text'>Cell Computing, Next Big Thing?</title><content type='html'>Pretty soon, high performance research computing may be done on Sony Playstation 3's. 
Not really, but certainly on the Cell processors that are in the PlayStation 3s.&lt;br /&gt;&lt;br /&gt;See this article from Los Alamos National Labs on the architecture of their next supercomputer.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.cs.utk.edu/%7Edongarra/cell2006/cell-slides/04-Ken-Koch.pdf"&gt;http://www.cs.utk.edu/~dongarra/cell2006/cell-slides/04-Ken-Koch.pdf&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-887700111605804750?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/887700111605804750/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=887700111605804750' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/887700111605804750'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/887700111605804750'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2007/02/cell-computing-next-big-thing.html' title='Cell Computing, Next Big Thing?'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-3094876463601944072</id><published>2007-02-12T07:31:00.000-08:00</published><updated>2007-02-17T14:22:52.577-08:00</updated><title type='text'>Matlab problem under Sun Grid Engine</title><content type='html'>One of the nagging problems we continue to have is 
that Matlab does not exit gracefully under Sun Grid Engine running on Linux. This has been going on for several Matlab releases. I had hoped it would be fixed under the latest release, but it seems not to have been. The problem is that the matlab_helper application, which Matlab spawns, never ends when the job is running on a Linux node. Solaris exits nicely. Apparently, the Matlab exit code doesn't know how to kill (or find) sub processes. The jobs never end and have to be killed by hand. Very annoying.&lt;br /&gt;&lt;br /&gt;The workaround is to have Matlab call a shell file as it is exiting. 
The shell file (finish.sh in the matlab/bin directory) contains&lt;br /&gt;&lt;br /&gt;(sleep 5; killall -1 matlab_helper) &amp;amp;&lt;br /&gt;exit 0&lt;br /&gt;&lt;br /&gt;This "usually" works...&lt;br /&gt;&lt;br /&gt;Pretty ugly.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-3094876463601944072?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/3094876463601944072/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=3094876463601944072' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/3094876463601944072'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/3094876463601944072'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2007/02/matlab-problem-under-sun-grid-engine.html' title='Matlab problem under Sun Grid Engine'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-9127151171786655139</id><published>2007-02-12T07:25:00.000-08:00</published><updated>2007-02-12T07:25:26.259-08:00</updated><title type='text'>Server hacked, what to 
do</title><content type='html'>First task today is to resurrect a Windows 2003 Server, which was hacked into last week. So much for Windows Update keeping you safe. It is still not clear how this person got in, but I caught him/her at about 10am last Friday, connected to the system through Terminal Services. They had cracked one of the administrative passwords on another system, and had installed a worm that was rapidly scanning everything in sight. Both the university network security people and I discovered him at about the same time. (Are there any female hackers?)&lt;br /&gt;&lt;br /&gt;Clearly the system needs to be rebuilt, but first there is 70GB of data that needs to be saved.&lt;br /&gt;Ouch. Tricky if you don't want to reconnect the machine to the network. I may connect it to a closed subnet and move the data to another machine. We'll see.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-9127151171786655139?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/9127151171786655139/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=9127151171786655139' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/9127151171786655139'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/9127151171786655139'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2007/02/server-hacked-what-to-do.html' title='Server hacked, what to do'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-8664028282644027940</id><published>2007-02-12T07:22:00.000-08:00</published><updated>2007-02-11T11:29:15.796-08:00</updated><title type='text'>Future of High Performance Computing</title><content type='html'>I just ran into this article. Good overview of where HPC is going and some of the challenges.&lt;br /&gt;http://www.ctwatch.org/quarterly/articles/2005/02/scientific-data-management/&lt;br /&gt;&lt;br /&gt;Note that &lt;a href="http://www.google.com"&gt;Google&lt;/a&gt; has had to solve a number of these problems already.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-8664028282644027940?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/8664028282644027940/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=8664028282644027940' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8664028282644027940'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8664028282644027940'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2007/02/future-of-high-performance-computing.html' title='Future of High Performance Computing'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-2970821917218809482</id><published>2007-02-11T09:35:00.000-08:00</published><updated>2007-02-12T07:24:13.059-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='High Performance Computing'/><category scheme='http://www.blogger.com/atom/ns#' term='Super Computers'/><category scheme='http://www.blogger.com/atom/ns#' term='HPC'/><category scheme='http://www.blogger.com/atom/ns#' term='Storage arrays'/><category scheme='http://www.blogger.com/atom/ns#' term='cluster computing'/><category scheme='http://www.blogger.com/atom/ns#' term='statistical computing'/><title type='text'>Rapidly increasing capacity</title><content type='html'>By the spring of 2006, it was clear that our usage was climbing dramatically, and we needed to do something. The &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0"&gt;ISE&lt;/span&gt; project (quote-by-quote options data) in the &lt;a href="http://www.stern.nyu.edu/salomon"&gt;Salomon Center&lt;/a&gt; was quickly chewing up disk space at the rate of 20GB per day, and we now had researchers submitting 10,000 jobs at a time. Soon, researchers were going to start studying all of the &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1"&gt;ISE&lt;/span&gt; data that was being collected.&lt;br /&gt;&lt;br /&gt;Over the summer, we added two new &lt;a href="http://www.wsm.com/"&gt;Western Scientific&lt;/a&gt; arrays, each with 6TB of storage. In addition, we added three &lt;a href="http://www.dell.com/"&gt;Dell&lt;/a&gt; &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_2"&gt;PowerEdge&lt;/span&gt; 1950 servers. We now had almost 20TB of storage and the equivalent of 46 processors with access to almost 100GB of RAM. 
Stern now has more processing power than &lt;a href="http://wrds.wharton.upenn.edu/"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_3"&gt;WRDS&lt;/span&gt;&lt;/a&gt;, and &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_4"&gt;WRDS&lt;/span&gt; is shared by more than 100 other schools. Our small cluster was now a small supercomputer, and we had more processing power than almost all other business schools, at least those that we know of. Please correct me if you know of other schools with this much power. We were in the &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_5"&gt;HPC&lt;/span&gt; (High Performance Computing) arena. Some of our researchers were also using other facilities, such as the &lt;a href="http://www.nyu.edu/its/supercomputer"&gt;Supercomputer at NYU&lt;/a&gt; and the TeraGrid. Researchers were moving some of their computing back from Wharton, since we had considerably more processing power. A &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_6"&gt;Linux&lt;/span&gt; version of &lt;a href="http://www.sas.com/"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_7"&gt;Sas&lt;/span&gt;&lt;/a&gt; dramatically increased the amount of &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_8"&gt;Sas&lt;/span&gt; processing power we had.&lt;br /&gt;&lt;br /&gt;Software packages available now included &lt;a href="http://www.insightful.com/"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_9"&gt;Splus&lt;/span&gt;&lt;/a&gt;, &lt;a href="http://www.matlab.com/"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_10"&gt;Matlab&lt;/span&gt;&lt;/a&gt;, &lt;a href="http://www.sas.com/"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_11"&gt;Sas&lt;/span&gt;&lt;/a&gt;, &lt;a href="http://www.aptech.com/"&gt;Gauss&lt;/a&gt;, &lt;a href="http://www.stata.com/"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_12"&gt;Stata&lt;/span&gt;&lt;/a&gt;, R, Octave, &lt;a 
href="http://www.spss.com/"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_13"&gt;SPSS&lt;/span&gt;&lt;/a&gt;, and &lt;a href="http://www.eviews.com/"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_14"&gt;Eviews&lt;/span&gt;&lt;/a&gt;, as well as many unknown applications running on faculty and &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_15"&gt;PhD&lt;/span&gt; workstations. The group of researchers who were using our facilities (i.e. the Grid, &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_16"&gt;WRDS&lt;/span&gt;, the behavioral lab) now numbered well above 100.&lt;br /&gt;&lt;br /&gt;Next I will talk about our most recent plans, and then I will go back and describe in more detail how we do some of the things we do.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-2970821917218809482?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/2970821917218809482/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=2970821917218809482' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2970821917218809482'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2970821917218809482'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2007/02/rapidly-increasing-capacity.html' title='Rapidly increasing capacity'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-8393372493215618384</id><published>2007-02-11T09:11:00.000-08:00</published><updated>2007-02-11T11:28:04.589-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='parallel computing'/><category scheme='http://www.blogger.com/atom/ns#' term='grid computing'/><category scheme='http://www.blogger.com/atom/ns#' term='backups'/><title type='text'>Establishment of the Stern Center for Research Computing</title><content type='html'>In the fall of 2004, I was approached by the Deans at Stern to see if I would be willing to serve as Faculty Director of the &lt;a href="http://www.stern.nyu.edu/scrc"&gt;Stern Center for Research Computing&lt;/a&gt;, a role similar to the Faculty Director of the CITL (&lt;a href="http://www.stern.nyu.edu/citl"&gt;Center for Innovation in Teaching and Learning&lt;/a&gt;) at Stern. I would essentially be responsible for all school activities related to the use of computers for research. This included our usage of WRDS, all of our data contracts, our internal Sun system, the Beowulf cluster I was running, faculty usage of PCs for research computing, software licenses, the Stern &lt;a href="http://www.stern.nyu.edu/behaviorlab"&gt;Behavioral Lab&lt;/a&gt; and other odds and ends.&lt;br /&gt;&lt;br /&gt;I agreed, and the Center was started. We made rapid progress in expanding our capabilities. We discovered a quad processor Dell PowerEdge server that wasn't being used, and added that to the cluster. We replaced the aging Sun server with a newer Sun V240 to provide the front end to the cluster, as well as to serve as a license server for much of our software. We moved our scheduling software to &lt;a href="http://gridengine.sun.com/"&gt;Sun Grid Engine&lt;/a&gt; from PBS. Two old Sun A1000 storage arrays were reclaimed to provide increased disk space for researchers. 
With the help of the &lt;a href="http://www.stern.nyu.edu/salomon"&gt;Salomon Center&lt;/a&gt; at Stern, we consolidated research computing into 3 new racks in the corner of the Stern Copy center. A &lt;a href="http://www.dell.com/"&gt;Dell&lt;/a&gt; Powervault tape backup unit provided independent backup facilities, using the Amanda tape backup system.  Several faculty research projects agreed to move their equipment into our racks, in return for the Center being able to schedule jobs on their machines. By the fall of 2005,  we were in full production with 20 times the processing power of the previous  year, and a growing list of users. Researchers were becoming more familiar with how to "parallelize" their tasks to take advantage of the growing number of processors.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-8393372493215618384?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/8393372493215618384/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=8393372493215618384' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8393372493215618384'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/8393372493215618384'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2007/02/establishment-of-stern-center-for.html' title='Establishment of the Stern Center for Research Computing'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-185386811036412314</id><published>2007-02-11T08:53:00.000-08:00</published><updated>2007-02-11T09:10:03.990-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='monte carlo'/><category scheme='http://www.blogger.com/atom/ns#' term='simulation'/><title type='text'>Precursor to the Stern Center for Research Computing</title><content type='html'>The Stern Center for Research Computing started as a pilot project with a grant from &lt;a href="http://www.citi.com/"&gt;Citigroup&lt;/a&gt;, which funded the acquisition of a small cluster of Linux machines (six 1.4 GHz &lt;a href="http://www.amd.com"&gt;AMD&lt;/a&gt; dual-processor Athlons with 768MB of RAM) from &lt;a href="http://www.pssclabs.com/"&gt;PSSC Labs&lt;/a&gt; in California. The cluster used PBS (Portable Batch System) for scheduling. I had sought the grant because I thought that the primary research computing facilities did not provide enough horsepower for real number crunching. At the time, researchers were using an old quad processor &lt;a href="http://www.sun.com"&gt;Sun&lt;/a&gt; machine for internal use, and had access (as do many other business schools) to WRDS &lt;a href="http://wrds.wharton.upenn.edu/"&gt;(Wharton Research Data Services)&lt;/a&gt; at the University of Pennsylvania.&lt;br /&gt;&lt;br /&gt;As they say, build it and they will come. Within a few weeks of getting the equipment up and running, researchers (primarily PhD students) had discovered that there was now a major new computing capability available. Programs that used to take weeks to run could now finish in days. Most applications were large Monte Carlo simulations using &lt;a href="http://www.insightful.com/"&gt;Splus&lt;/a&gt; or &lt;a href="http://www.matlab.com/"&gt;Matlab. 
&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Our Sas users and other users continued to use our old timesharing system and WRDS.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-185386811036412314?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/185386811036412314/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=185386811036412314' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/185386811036412314'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/185386811036412314'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2007/02/precursor-to-stern-center-for-research.html' title='Precursor to the Stern Center for Research Computing'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6614598184406322281.post-2431720914207715926</id><published>2007-02-11T08:36:00.000-08:00</published><updated>2007-02-11T08:51:35.828-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='research computing'/><category scheme='http://www.blogger.com/atom/ns#' term='startup'/><category scheme='http://www.blogger.com/atom/ns#' term='business school'/><title type='text'>Welcome to the Research Computing Blog</title><content type='html'>This is the first post, so I am just getting used to the tools. 
I hope to make it more sophisticated as I become more experienced.&lt;br /&gt;&lt;br /&gt;Why this blog? Well, as Faculty Director of &lt;a href="http://www.stern.nyu.edu/scrc"&gt;the Stern Center for Research Computing&lt;/a&gt; at the &lt;a href="http://www.stern.nyu.edu/"&gt;Stern School of Business at New York University&lt;/a&gt;, I spend a fair amount of my time thinking about research computing. I decided to start a blog as a way to keep track of our activities, as well as to share ideas and information about research computing, especially in academia, and more especially in business schools. Every day we are presented with research problems where faculty or PhD students are trying to solve a particular problem and need a tool or approach or some resources to help them. This blog is meant to help organize some of this information so it might be useful to other schools, and so we might start sharing best practices.&lt;br /&gt;&lt;br /&gt;The early posts will provide some background on research computing at Stern, describing how we got to where we are, while the later posts will be a chronicle of daily and weekly events. I hope to report on a variety of technologies, services and products that we use, so that researchers from other institutions can provide feedback. 
Stay tuned!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6614598184406322281-2431720914207715926?l=researchcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://researchcomputing.blogspot.com/feeds/2431720914207715926/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6614598184406322281&amp;postID=2431720914207715926' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2431720914207715926'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6614598184406322281/posts/default/2431720914207715926'/><link rel='alternate' type='text/html' href='http://researchcomputing.blogspot.com/2007/02/welcome-to-research-computing-blog.html' title='Welcome to the Research Computing Blog'/><author><name>Norman White</name><uri>http://www.blogger.com/profile/07953663379222392182</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www.stern.nyu.edu/~nwhite/nwhite.gif'/></author><thr:total>0</thr:total></entry></feed>
