Sponsored
It is easy to spend a lot of time thinking about the compute and interconnect in any kind of high performance computing workload – and hard not to spend just as much time thinking about the storage supporting that workload. It is particularly important to think about the type and volume of the data that will feed into these applications because this, more than any other factor, will determine the success or failure of that workload in meeting the needs of the organization.
It is in vogue these days to have a “cloud first” mentality when it comes to IT infrastructure, but what organizations really need is a “data first” attitude – one that recognizes the cloud is just a deployment model with a pricing scheme and, perhaps, a deeper pool of resources than many organizations are accustomed to. But those deep pools come at a cost. It is fairly cheap to move data into clouds, or to generate it there and keep it there; it can, however, be exorbitantly expensive to move data out of a cloud so it can be used elsewhere.
The new classes of HPC applications, such as machine learning training and data analytics running at scale, tend to feed on or create large datasets, so it is important to maintain this “data first” attitude as the system is being architected. The one thing you don’t want to do is discover somewhere between proof of concept and production that you have the wrong storage – or, worse yet, find that your storage can’t keep up with the data because a new workload has rolled into production and become a wild success.
source: The Register