{"id":3677,"date":"2019-03-25T17:45:27","date_gmt":"2019-03-25T20:45:27","guid":{"rendered":"http:\/\/xexeu.elipse.com.br\/pt\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/"},"modified":"2019-10-31T14:01:00","modified_gmt":"2019-10-31T17:01:00","slug":"removendo-dados-discrepantes-outliers-com-a-linguagem-python","status":"publish","type":"post","link":"https:\/\/kb.elipse.com.br\/en\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/","title":{"rendered":"Removing outliers with Python."},"content":{"rendered":"<p align=\"justify\"><b>Python<\/b> is the scripting language used by EPM. For more information, see the <a href=\"https:\/\/www.elipse.com.br\/produto\/elipse-plant-manager\/\">Elipse Plant Manager product page<\/a>.<\/p>\n<p align=\"justify\">A statistical analysis often cannot be validated because of outliers, that is, values that are inconsistent with the rest of the set. In this case study, assume that a faulty ambient temperature sensor produced wrong readings. Consider the following data set:<\/p>\n<p align=\"justify\"><span style=\"font-family: Courier New\">Temperaturas = [ 25, 26, <span style=\"color: red\">225<\/span>, 24, 23, 24, 25, <span style=\"color: red\">325<\/span>, 28, 27]<\/span><\/p>\n<p align=\"justify\">Some of these values are clearly incompatible with the rest. See what happens when we compute the mean of all the values:<\/p>\n<p align=\"justify\">Mean = (25+26+225+24+23+24+25+325+28+27) \/ 10 = 75.2<\/p>\n<p align=\"justify\">This mean is obviously far from the normal readings, which would lead to errors when interpreting the data. 
Now look at the mean without the outliers.<\/p>\n<p align=\"justify\">Mean = (25+26+24+23+24+25+28+27) \/ 8 = 25.25<\/p>\n<p align=\"justify\">The <i><span>removeoutlier<\/span><\/i> function uses John Tukey's method (John Tukey, Exploratory Data Analysis, Addison-Wesley, 1977, pp. 43-44).<\/p>\n<p align=\"justify\">Let:<\/p>\n<p align=\"justify\"><b>q1<\/b> be the first quartile of the set of values.<\/p>\n<p align=\"justify\"><b>q3<\/b> be the third quartile of the set of values.<\/p>\n<p align=\"justify\">The outliers are the values in the set that fall below q1 - 1.5(q3-q1) or above q3 + 1.5(q3-q1).<\/p>\n<p><span style=\"font-family: Courier New\"><span style=\"color: #ff9900\">import<\/span> numpy <span style=\"color: #ff9900\">as<\/span> np<\/span><\/p>\n<p><span style=\"font-family: Courier New\"><span style=\"color: #ff9900\">def<\/span> <span style=\"color: #ff6600\">removeoutlier<\/span>(<span style=\"color: #3366ff\">values<\/span>):<\/span><\/p>\n<p><span style=\"font-family: Courier New\"><span>\u00a0\u00a0\u00a0 <\/span>fator <span style=\"color: #ff9900\">=<\/span> <span style=\"color: #993300\">1.5<\/span><\/span><\/p>\n<p><span style=\"font-family: Courier New\"><span>\u00a0\u00a0\u00a0 <\/span>q3, q1 <span style=\"color: #ff9900\">=<\/span> np.percentile(<span style=\"color: #3366ff\">values<\/span>, [<span style=\"color: #993300\">75<\/span>, <span style=\"color: #993300\">25<\/span>])<\/span><\/p>\n<p><span style=\"font-family: Courier New\"><span>\u00a0\u00a0\u00a0 <\/span>iqr <span style=\"color: #ff9900\">=<\/span> q3 <span style=\"color: #ff9900\">-<\/span> q1<\/span><\/p>\n<p><span style=\"font-family: Courier New\"><span>\u00a0\u00a0\u00a0 <\/span>lowpass <span style=\"color: #ff9900\">=<\/span> q1 <span style=\"color: #ff9900\">-<\/span> (iqr <span style=\"color: #ff9900\">*<\/span> fator)<\/span><\/p>\n<p><span 
style=\"font-family: Courier New\"><span>\u00a0\u00a0\u00a0 <\/span>highpass <span style=\"color: #ff9900\">=<\/span> q3 <span style=\"color: #ff9900\">+<\/span> (iqr <span style=\"color: #ff9900\">*<\/span> fator)<\/span><\/p>\n<p><span style=\"font-family: Courier New\"><span>\u00a0\u00a0\u00a0 <\/span><span style=\"color: #ff9900\">return<\/span> [v for v in values if v &gt;= lowpass and v &lt;= highpass]<\/span><\/p>\n<p align=\"justify\">A commented version of this script is attached to this article.<\/p>\n<p align=\"justify\"><b>NOTE<\/b>: Outliers do not always represent reading errors. For example, an extreme temperature could be caused by adverse weather conditions. You need to evaluate the data set, the context in which it was generated, and the information you want to extract from it in order to choose the most suitable outlier-removal technique.<\/p>\n<h3 align=\"justify\">Attachments:<\/h3>\n<p><a href=\"\/wp-content\/uploads\/2019\/03\/removeoutliers.zip\">removeoutliers.zip<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Python is the scripting language used by EPM. For more information, see the Elipse Plant Manager product page. A statistical analysis often cannot be validated because of outliers, that is,&hellip;<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0},"categories":[676,714],"tags":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v19.8 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Removendo dados discrepantes (outliers) com a linguagem Python. 
- Elipse Knowledgebase<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Removendo dados discrepantes (outliers) com a linguagem Python.\" \/>\n<meta property=\"og:description\" content=\"Python \u00e9 a linguagem de scripts utilizada pelo EPM. Para maiores informa\u00e7\u00f5es, clique aqui. Muitas vezes, uma an\u00e1lise estat\u00edstica n\u00e3o pode ser validada devido \u00e0 exist\u00eancia de outliers, ou seja,&hellip;\" \/>\n<meta property=\"og:url\" content=\"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/\" \/>\n<meta property=\"og:site_name\" content=\"Elipse Knowledgebase\" \/>\n<meta property=\"article:publisher\" content=\"http:\/\/www.facebook.com\/elipsesoftware\" \/>\n<meta property=\"article:published_time\" content=\"2019-03-25T20:45:27+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2019-10-31T17:01:00+00:00\" \/>\n<meta name=\"author\" content=\"Lucas Kotres\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Lucas Kotres\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/\"},\"author\":{\"name\":\"Lucas Kotres\",\"@id\":\"https:\/\/kb.elipse.com.br\/#\/schema\/person\/e57d707c58cf7aa3231eb5fdcc2ec379\"},\"headline\":\"Removendo dados discrepantes (outliers) com a linguagem Python.\",\"datePublished\":\"2019-03-25T20:45:27+00:00\",\"dateModified\":\"2019-10-31T17:01:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/\"},\"wordCount\":295,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/kb.elipse.com.br\/#organization\"},\"articleSection\":[\"Elipse Plant Manager\",\"Scripts\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/\",\"url\":\"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/\",\"name\":\"[:pt]Removendo dados discrepantes (outliers) com a linguagem Python.[:] - Elipse 
Knowledgebase\",\"isPartOf\":{\"@id\":\"https:\/\/kb.elipse.com.br\/#website\"},\"datePublished\":\"2019-03-25T20:45:27+00:00\",\"dateModified\":\"2019-10-31T17:01:00+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"In\u00edcio\",\"item\":\"https:\/\/kb.elipse.com.br\/en\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Removendo dados discrepantes (outliers) com a linguagem Python.\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/kb.elipse.com.br\/#website\",\"url\":\"https:\/\/kb.elipse.com.br\/\",\"name\":\"Elipse Knowledgebase\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/kb.elipse.com.br\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/kb.elipse.com.br\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/kb.elipse.com.br\/#organization\",\"name\":\"Elipse Software\",\"url\":\"https:\/\/kb.elipse.com.br\/\",\"sameAs\":[\"http:\/\/www.facebook.com\/elipsesoftware\"],\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/kb.elipse.com.br\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/kb.elipse.com.br\/wp-content\/uploads\/2019\/05\/schererelipse-com-br\/logoElipse.png\",\"contentUrl\":\"https:\/\/kb.elipse.com.br\/wp-content\/uploads\/2019\/05\/schererelipse-com-br\/logoElipse.png\",\"width\":161,\"height\":58,\"caption\":\"Elipse 
Software\"},\"image\":{\"@id\":\"https:\/\/kb.elipse.com.br\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/kb.elipse.com.br\/#\/schema\/person\/e57d707c58cf7aa3231eb5fdcc2ec379\",\"name\":\"Lucas Kotres\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/kb.elipse.com.br\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/faa3c53fb2e8ab0c35c4b8c90a691931?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/faa3c53fb2e8ab0c35c4b8c90a691931?s=96&d=mm&r=g\",\"caption\":\"Lucas Kotres\"},\"url\":\"https:\/\/kb.elipse.com.br\/en\/author\/kotres\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Removendo dados discrepantes (outliers) com a linguagem Python. - Elipse Knowledgebase","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/","og_locale":"en_US","og_type":"article","og_title":"[:pt]Removendo dados discrepantes (outliers) com a linguagem Python.[:] - Elipse Knowledgebase","og_description":"Python \u00e9 a linguagem de scripts utilizada pelo EPM. Para maiores informa\u00e7\u00f5es, clique aqui. Muitas vezes, uma an\u00e1lise estat\u00edstica n\u00e3o pode ser validada devido \u00e0 exist\u00eancia de outliers, ou seja,&hellip;","og_url":"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/","og_site_name":"Elipse Knowledgebase","article_publisher":"http:\/\/www.facebook.com\/elipsesoftware","article_published_time":"2019-03-25T20:45:27+00:00","article_modified_time":"2019-10-31T17:01:00+00:00","author":"Lucas Kotres","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Lucas Kotres","Est. 
reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/#article","isPartOf":{"@id":"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/"},"author":{"name":"Lucas Kotres","@id":"https:\/\/kb.elipse.com.br\/#\/schema\/person\/e57d707c58cf7aa3231eb5fdcc2ec379"},"headline":"Removendo dados discrepantes (outliers) com a linguagem Python.","datePublished":"2019-03-25T20:45:27+00:00","dateModified":"2019-10-31T17:01:00+00:00","mainEntityOfPage":{"@id":"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/"},"wordCount":295,"commentCount":0,"publisher":{"@id":"https:\/\/kb.elipse.com.br\/#organization"},"articleSection":["Elipse Plant Manager","Scripts"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/","url":"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/","name":"[:pt]Removendo dados discrepantes (outliers) com a linguagem Python.[:] - Elipse 
Knowledgebase","isPartOf":{"@id":"https:\/\/kb.elipse.com.br\/#website"},"datePublished":"2019-03-25T20:45:27+00:00","dateModified":"2019-10-31T17:01:00+00:00","breadcrumb":{"@id":"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/kb.elipse.com.br\/removendo-dados-discrepantes-outliers-com-a-linguagem-python\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"In\u00edcio","item":"https:\/\/kb.elipse.com.br\/en\/"},{"@type":"ListItem","position":2,"name":"Removendo dados discrepantes (outliers) com a linguagem Python."}]},{"@type":"WebSite","@id":"https:\/\/kb.elipse.com.br\/#website","url":"https:\/\/kb.elipse.com.br\/","name":"Elipse Knowledgebase","description":"","publisher":{"@id":"https:\/\/kb.elipse.com.br\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/kb.elipse.com.br\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/kb.elipse.com.br\/#organization","name":"Elipse Software","url":"https:\/\/kb.elipse.com.br\/","sameAs":["http:\/\/www.facebook.com\/elipsesoftware"],"logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/kb.elipse.com.br\/#\/schema\/logo\/image\/","url":"https:\/\/kb.elipse.com.br\/wp-content\/uploads\/2019\/05\/schererelipse-com-br\/logoElipse.png","contentUrl":"https:\/\/kb.elipse.com.br\/wp-content\/uploads\/2019\/05\/schererelipse-com-br\/logoElipse.png","width":161,"height":58,"caption":"Elipse 
Software"},"image":{"@id":"https:\/\/kb.elipse.com.br\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/kb.elipse.com.br\/#\/schema\/person\/e57d707c58cf7aa3231eb5fdcc2ec379","name":"Lucas Kotres","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/kb.elipse.com.br\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/faa3c53fb2e8ab0c35c4b8c90a691931?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/faa3c53fb2e8ab0c35c4b8c90a691931?s=96&d=mm&r=g","caption":"Lucas Kotres"},"url":"https:\/\/kb.elipse.com.br\/en\/author\/kotres\/"}]}},"_links":{"self":[{"href":"https:\/\/kb.elipse.com.br\/en\/wp-json\/wp\/v2\/posts\/3677"}],"collection":[{"href":"https:\/\/kb.elipse.com.br\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kb.elipse.com.br\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kb.elipse.com.br\/en\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/kb.elipse.com.br\/en\/wp-json\/wp\/v2\/comments?post=3677"}],"version-history":[{"count":5,"href":"https:\/\/kb.elipse.com.br\/en\/wp-json\/wp\/v2\/posts\/3677\/revisions"}],"predecessor-version":[{"id":9259,"href":"https:\/\/kb.elipse.com.br\/en\/wp-json\/wp\/v2\/posts\/3677\/revisions\/9259"}],"wp:attachment":[{"href":"https:\/\/kb.elipse.com.br\/en\/wp-json\/wp\/v2\/media?parent=3677"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kb.elipse.com.br\/en\/wp-json\/wp\/v2\/categories?post=3677"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kb.elipse.com.br\/en\/wp-json\/wp\/v2\/tags?post=3677"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}