{"id":13955,"date":"2018-09-12T12:02:42","date_gmt":"2018-09-12T11:02:42","guid":{"rendered":"http:\/\/www.devopsonline.co.uk\/?p=13955"},"modified":"2018-09-12T12:02:42","modified_gmt":"2018-09-12T11:02:42","slug":"the-importance-of-realistic-data-in-tests","status":"publish","type":"post","link":"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/","title":{"rendered":"The importance of realistic data in tests"},"content":{"rendered":"

There are many tests that we can execute in\u00a0<\/span>Business Intelligence<\/span><\/a>\u00a0(BI) systems or any system that uses uncontrolled data\u00a0\u2013\u00a0<\/span>extract, transform, load<\/span><\/a>\u00a0(ETL), queries, performance etc.<\/span><\/p>\n

Testing should be done under conditions as close as possible to those in which actual users will use the system in production. One of the keys to success here is the data used during testing.<\/span><\/p>\n

Some applications use only data that they produce themselves, like alarm clocks. Others use only predefined data, like weather apps. Those cases are relatively easy. But when your application or system uses many types of data, including varied external data and sometimes unstructured data, as big data systems do, the data might be\u00a0corrupted\u00a0or\u00a0unexpected\u00a0(in other words, the code can\u2019t handle it). Unexpected data is not the only cause of data integrity issues, though. The\u00a0data processing\u00a0itself can malfunction on items that are supposed to be supported: for example, an inability to process a certain picture format or a variation of it.<\/span><\/p>\n

Another kind of malfunction is a failure of the BI system to\u00a0filter\u00a0or\u00a0query\u00a0correctly: for example, querying for items up to 2k and also getting items of 2.1k. You can think of a test like that one, but you can’t think of every test. Running the tests you did think of against a large amount of data increases the chance\u00a0of finding more issues.<\/span><\/p>\n
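
The 2k example above can be turned into a boundary check that runs against many items instead of one. The following is a minimal sketch, not a real BI query: the filter_up_to function, the size field and the chosen values are all hypothetical stand-ins for whatever filter the system under test exposes.

```python
def filter_up_to(items, limit):
    # Hypothetical stand-in for the filter under test:
    # keep only items whose size does not exceed the limit.
    return [item for item in items if limit >= item['size']]


# Sizes placed right around the 2k boundary, plus a larger spread of
# other values so the filter is exercised on more than a handful of rows.
sizes = [1900, 1999, 2000, 2001, 2100] + list(range(0, 5000, 37))
items = [{'id': i, 'size': size} for i, size in enumerate(sizes)]

result = filter_up_to(items, 2000)

# Items that should not have passed the filter (for example the 2.1k one).
leaked = [item for item in result if item['size'] > 2000]
# Items that should have passed but are missing from the result.
result_ids = {item['id'] for item in result}
missing = [item for item in items
           if 2000 >= item['size'] and item['id'] not in result_ids]
```

Checking both leaked and missing matters: a filter that returns too little is as wrong as one that returns too much, and only a data set with values on both sides of the boundary can show either.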

The number of possibilities is unlimited, and so is the number of data types and tests. This carries many risks, from data loss or data that is not processed correctly to system downtime.<\/span><\/p>\n

Regarding the last point, it is also the programmer’s responsibility to handle unexpected data in the code.<\/span><\/p>\n

Reducing risks<\/span><\/h2>\n

Always use as much data as you can in your tests.\u00a0<\/span>Fill the database with many items of the type you are testing, alongside all other supported and unsupported data. For example, if you test emails, make sure you have a lot of emails in the system: in different languages, of different lengths, with and without attachments, with attachments of different sizes, etc. If you don\u2019t have enough data, try to develop test code that can produce a large amount of data whose content you control.<\/span><\/p>\n
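
As an illustration of such test code, here is a minimal sketch of a generator for varied email records. The field names, subjects and size choices are assumptions made for the example, not part of any real system; a fixed seed keeps the generated content reproducible, so a failing test can be rerun on the same data.

```python
import random

# Hypothetical subjects in several languages, including an empty one.
SUBJECTS = ['Quarterly report', 'Rapport trimestriel', 'Quartalsbericht', '']


def generate_emails(count, seed=0):
    # A seeded generator makes every run produce the same data set.
    rng = random.Random(seed)
    emails = []
    for i in range(count):
        # Roughly half of the emails get no attachment at all.
        attachment_count = rng.choice([0, 0, 1, 3])
        emails.append({
            'id': i,
            'subject': rng.choice(SUBJECTS),
            # Bodies range from empty to very long.
            'body': 'x' * rng.choice([0, 10, 1000, 100000]),
            'attachments': [
                {'name': f'file_{i}_{n}.bin',
                 'size': rng.choice([0, 1024, 5 * 1024 * 1024])}
                for n in range(attachment_count)
            ],
        })
    return emails


emails = generate_emails(1000)
```

Because the generator decides every field, the test knows exactly how many emails of each kind exist and can compare that with what the system reports.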

Otherwise, suppose the test says: apply filter X and make sure you see item A, a document with the title \u201cI am a document\u201d. If item A is the only document in the database, you might miss a bug that retrieves every document whose title starts with \u201cI\u201d, or one that lets other types of data into the result (a picture named \u201cI am a document\u201d, for example).<\/span><\/p>\n

Customer data<\/span><\/h2>\n

Using real data from the customer or from production is not a nice-to-have but a very important factor in the success of the tests, and it can uncover all the kinds of issues described above. True, it is not always available, but because of its importance, we must do our best to get it.<\/span><\/p>\n

Feel free to abuse<\/span><\/h2>\n

Feel free to abuse the data, from corrupted data packets to pictures. Truncate the data, add unexpected data, make it very long or very short, etc. Fill the database with unsupported data.<\/span><\/p>\n
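
One possible way to produce abused inputs from a valid sample is sketched below. The helper names (truncate, flip_bytes, abuse) and the particular corruptions are illustrative assumptions; real tests would pick corruptions that match the formats the system accepts.

```python
import random


def truncate(data, rng):
    # Cut the payload at a random point, so it is always shorter than the original.
    cut = rng.randrange(len(data)) if data else 0
    return data[:cut]


def flip_bytes(data, rng, count=3):
    # Invert a few randomly chosen bytes while keeping the length unchanged.
    corrupted = bytearray(data)
    for _ in range(min(count, len(corrupted))):
        pos = rng.randrange(len(corrupted))
        corrupted[pos] ^= 0xFF
    return bytes(corrupted)


def abuse(data, seed=0):
    # Return several damaged variants of one valid payload:
    # empty, truncated, bit-flipped and oversized.
    rng = random.Random(seed)
    return [b'', truncate(data, rng), flip_bytes(data, rng), data * 1000]


sample = b'valid picture or packet bytes would go here'
variants = abuse(sample)
```

Each variant can then be fed through the same path as valid data; the expectation is that the system rejects or quarantines it without losing other data or going down.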

Know your customer<\/span><\/h2>\n

If you have a specific customer, research them. If they are in a specific country and the data comes from the web, research which websites and which languages are the most popular in that country. If the data is related to apps or phone types, research the most common apps and phones in that country, and base your tests on that.<\/span><\/p>\n

The last tip about the data is that testing ETL and executing queries is not enough to validate that everything works. You need to make sure the data flows correctly through the whole system, all the way to the export.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"

There are many tests that we can execute in\u00a0Business Intelligence\u00a0(BI) systems or any system that uses uncontrolled data\u00a0\u2013\u00a0extract, transform, load\u00a0(ETL), queries, performance etc<\/p>\n","protected":false},"author":2,"featured_media":13956,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"content-type":"","pmpro_default_level":"","footnotes":""},"categories":[2],"tags":[862,284,691,3063,448,3197,395,3196],"yoast_head":"\nThe importance of realistic data in tests<\/title>\n<meta name=\"description\" content=\"There are many tests that we can execute in\u00a0Business Intelligence\u00a0(BI) systems or any system that uses uncontrolled data\u00a0\u2013\u00a0extract, transform, load\u00a0(ETL), queries, performance etc.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"The importance of realistic data in tests\" \/>\n<meta property=\"og:description\" content=\"There are many tests that we can execute in\u00a0Business Intelligence\u00a0(BI) systems or any system that uses uncontrolled data\u00a0\u2013\u00a0extract, transform, load\u00a0(ETL), queries, performance etc.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/\" \/>\n<meta property=\"og:site_name\" content=\"DevOps Online North America\" \/>\n<meta property=\"article:published_time\" content=\"2018-09-12T11:02:42+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/devopsnews.online\/wp-content\/uploads\/2018\/09\/data_1507816117.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1280\" \/>\n\t<meta property=\"og:image:height\" 
content=\"853\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DevOps Online\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DevOpsAmerica\" \/>\n<meta name=\"twitter:site\" content=\"@DevOpsAmerica\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"DevOps Online\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/\"},\"author\":{\"name\":\"DevOps Online\",\"@id\":\"https:\/\/devopsnews.online\/#\/schema\/person\/de52473fff111f14d90763193184cb1e\"},\"headline\":\"The importance of realistic data in tests\",\"datePublished\":\"2018-09-12T11:02:42+00:00\",\"dateModified\":\"2018-09-12T11:02:42+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/\"},\"wordCount\":655,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/devopsnews.online\/#organization\"},\"image\":{\"@id\":\"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/devopsnews.online\/wp-content\/uploads\/2018\/09\/data_1507816117.jpg\",\"keywords\":[\"BI\",\"Big Data\",\"business intelligence\",\"customer 
data\",\"data\",\"ETL\",\"QA\",\"Verint-Systems\"],\"articleSection\":[\"Featured\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/\",\"url\":\"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/\",\"name\":\"The importance of realistic data in tests\",\"isPartOf\":{\"@id\":\"https:\/\/devopsnews.online\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/devopsnews.online\/wp-content\/uploads\/2018\/09\/data_1507816117.jpg\",\"datePublished\":\"2018-09-12T11:02:42+00:00\",\"dateModified\":\"2018-09-12T11:02:42+00:00\",\"description\":\"There are many tests that we can execute in\u00a0Business Intelligence\u00a0(BI) systems or any system that uses uncontrolled data\u00a0\u2013\u00a0extract, transform, load\u00a0(ETL), queries, performance etc.\",\"breadcrumb\":{\"@id\":\"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/#primaryimage\",\"url\":\"https:\/\/devopsnews.online\/wp-content\/uploads\/2018\/09\/data_1507816117.jpg\",\"contentUrl\":\"https:\/\/devopsnews.online\/wp-content\/uploads\/2018\/09\/data_1507816117.jpg\",\"width\":1280,\"height\":853,\"caption\":\"data 
tests\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/devopsnews.online\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"The importance of realistic data in tests\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/devopsnews.online\/#website\",\"url\":\"https:\/\/devopsnews.online\/\",\"name\":\"DevOps Online North America\",\"description\":\"by 31 Media Ltd.\",\"publisher\":{\"@id\":\"https:\/\/devopsnews.online\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/devopsnews.online\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/devopsnews.online\/#organization\",\"name\":\"DevOps Online North America\",\"url\":\"https:\/\/devopsnews.online\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/devopsnews.online\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/devopsnews.online\/wp-content\/uploads\/2020\/03\/DevOpsOnline_email.png\",\"contentUrl\":\"https:\/\/devopsnews.online\/wp-content\/uploads\/2020\/03\/DevOpsOnline_email.png\",\"width\":198,\"height\":64,\"caption\":\"DevOps Online North America\"},\"image\":{\"@id\":\"https:\/\/devopsnews.online\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/DevOpsAmerica\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/devopsnews.online\/#\/schema\/person\/de52473fff111f14d90763193184cb1e\",\"name\":\"DevOps 
Online\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/devopsnews.online\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/cf0ba37fb1f8baf226b40986afbe7f9f?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/cf0ba37fb1f8baf226b40986afbe7f9f?s=96&d=mm&r=g\",\"caption\":\"DevOps Online\"},\"url\":\"https:\/\/devopsnews.online\/author\/test-magazine\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"The importance of realistic data in tests","description":"There are many tests that we can execute in\u00a0Business Intelligence\u00a0(BI) systems or any system that uses uncontrolled data\u00a0\u2013\u00a0extract, transform, load\u00a0(ETL), queries, performance etc.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/","og_locale":"en_US","og_type":"article","og_title":"The importance of realistic data in tests","og_description":"There are many tests that we can execute in\u00a0Business Intelligence\u00a0(BI) systems or any system that uses uncontrolled data\u00a0\u2013\u00a0extract, transform, load\u00a0(ETL), queries, performance etc.","og_url":"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/","og_site_name":"DevOps Online North America","article_published_time":"2018-09-12T11:02:42+00:00","og_image":[{"width":1280,"height":853,"url":"https:\/\/devopsnews.online\/wp-content\/uploads\/2018\/09\/data_1507816117.jpg","type":"image\/jpeg"}],"author":"DevOps Online","twitter_card":"summary_large_image","twitter_creator":"@DevOpsAmerica","twitter_site":"@DevOpsAmerica","twitter_misc":{"Written by":"DevOps Online","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/#article","isPartOf":{"@id":"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/"},"author":{"name":"DevOps Online","@id":"https:\/\/devopsnews.online\/#\/schema\/person\/de52473fff111f14d90763193184cb1e"},"headline":"The importance of realistic data in tests","datePublished":"2018-09-12T11:02:42+00:00","dateModified":"2018-09-12T11:02:42+00:00","mainEntityOfPage":{"@id":"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/"},"wordCount":655,"commentCount":0,"publisher":{"@id":"https:\/\/devopsnews.online\/#organization"},"image":{"@id":"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/#primaryimage"},"thumbnailUrl":"https:\/\/devopsnews.online\/wp-content\/uploads\/2018\/09\/data_1507816117.jpg","keywords":["BI","Big Data","business intelligence","customer data","data","ETL","QA","Verint-Systems"],"articleSection":["Featured"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/","url":"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/","name":"The importance of realistic data in tests","isPartOf":{"@id":"https:\/\/devopsnews.online\/#website"},"primaryImageOfPage":{"@id":"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/#primaryimage"},"image":{"@id":"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/#primaryimage"},"thumbnailUrl":"https:\/\/devopsnews.online\/wp-content\/uploads\/2018\/09\/data_1507816117.jpg","datePublished":"2018-09-12T11:02:42+00:00","dateModified":"2018-09-12T11:02:42+00:00","description":"There are many tests that 
we can execute in\u00a0Business Intelligence\u00a0(BI) systems or any system that uses uncontrolled data\u00a0\u2013\u00a0extract, transform, load\u00a0(ETL), queries, performance etc.","breadcrumb":{"@id":"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/#primaryimage","url":"https:\/\/devopsnews.online\/wp-content\/uploads\/2018\/09\/data_1507816117.jpg","contentUrl":"https:\/\/devopsnews.online\/wp-content\/uploads\/2018\/09\/data_1507816117.jpg","width":1280,"height":853,"caption":"data tests"},{"@type":"BreadcrumbList","@id":"https:\/\/devopsnews.online\/the-importance-of-realistic-data-in-tests\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/devopsnews.online\/"},{"@type":"ListItem","position":2,"name":"The importance of realistic data in tests"}]},{"@type":"WebSite","@id":"https:\/\/devopsnews.online\/#website","url":"https:\/\/devopsnews.online\/","name":"DevOps Online North America","description":"by 31 Media Ltd.","publisher":{"@id":"https:\/\/devopsnews.online\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/devopsnews.online\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/devopsnews.online\/#organization","name":"DevOps Online North 
America","url":"https:\/\/devopsnews.online\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/devopsnews.online\/#\/schema\/logo\/image\/","url":"https:\/\/devopsnews.online\/wp-content\/uploads\/2020\/03\/DevOpsOnline_email.png","contentUrl":"https:\/\/devopsnews.online\/wp-content\/uploads\/2020\/03\/DevOpsOnline_email.png","width":198,"height":64,"caption":"DevOps Online North America"},"image":{"@id":"https:\/\/devopsnews.online\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/DevOpsAmerica"]},{"@type":"Person","@id":"https:\/\/devopsnews.online\/#\/schema\/person\/de52473fff111f14d90763193184cb1e","name":"DevOps Online","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/devopsnews.online\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/cf0ba37fb1f8baf226b40986afbe7f9f?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/cf0ba37fb1f8baf226b40986afbe7f9f?s=96&d=mm&r=g","caption":"DevOps Online"},"url":"https:\/\/devopsnews.online\/author\/test-magazine\/"}]}},"_links":{"self":[{"href":"https:\/\/devopsnews.online\/wp-json\/wp\/v2\/posts\/13955"}],"collection":[{"href":"https:\/\/devopsnews.online\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devopsnews.online\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devopsnews.online\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/devopsnews.online\/wp-json\/wp\/v2\/comments?post=13955"}],"version-history":[{"count":0,"href":"https:\/\/devopsnews.online\/wp-json\/wp\/v2\/posts\/13955\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devopsnews.online\/wp-json\/wp\/v2\/media\/13956"}],"wp:attachment":[{"href":"https:\/\/devopsnews.online\/wp-json\/wp\/v2\/media?parent=13955"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devopsnews.online\/wp-json\/wp\/v2\/categories?post=13955"},{"taxonomy":"post_tag","embeddable":true,"hr
ef":"https:\/\/devopsnews.online\/wp-json\/wp\/v2\/tags?post=13955"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}