Author: Berman, Jules J.
Publisher: Elsevier
Availability: 3-6 weeks
Price: 296.10 zł
Please contact us by email to confirm the price before placing an order.
ISBN13: 9780128037812
Binding: Paperback
Publication date: 2016-03-09
Subjects: UB
Data Simplification: Taming Information With Open Source Tools addresses the simple fact that modern data is too big and complex to analyze in its native form. Data simplification is the process whereby large and complex data is rendered usable. Complex data must be simplified before it can be analyzed, but the process of data simplification is anything but simple, requiring a specialized set of skills and tools.
This book provides data scientists from every scientific discipline with the methods and tools to simplify their data for immediate analysis or long-term storage in a form that can be readily repurposed or integrated with other data.
Drawing upon years of practical experience, and using numerous examples and use cases, Jules Berman discusses the principles, methods, and tools that must be studied and mastered to achieve data simplification; the open source tools, free utilities, and snippets of code that can be reused and repurposed to simplify data; natural language processing and machine translation as tools to simplify data; and data summarization and visualization and the role they play in making data useful for the end user.
Chapter 1. The Simple Life
Open Source Tools for Chapter 1
Perl
Python
Ruby
Text Editors
OpenOffice
Command line utilities
Cygwin, Linux emulation for Windows
DOS batch scripts
Linux bash scripts
Interactive line interpreters
Package installers
System calls
References for Chapter 1
Glossary for Chapter 1
Chapter 2. Structuring Text
Open Source Tools for Chapter 2
ASCII
Regular expressions
Format commands
Converting non-printable files to plain-text
Dublin Core
References for Chapter 2
Glossary for Chapter 2
Chapter 3. Indexing Text
Open Source Tools for Chapter 3
Word lists
Doublet lists
Ngram lists
References for Chapter 3
Glossary for Chapter 3
Chapter 4. Understanding Your Data
Open Source Tools for Chapter 4
Gnuplot
Matplotlib
R, for statistical programming
NumPy
SciPy
ImageMagick
Displaying equations in LaTeX
Normalized compression distance
Pearson's correlation
The ridiculously simple dot product
References for Chapter 4
Glossary for Chapter 4
Chapter 5. Identifying and Deidentifying Data
Open Source Tools for Chapter 5
Pseudorandom number generators
UUID
Encryption and decryption with OpenSSL
One-way hash implementations
Steganography
References for Chapter 5
Glossary for Chapter 5
Chapter 6. Giving Meaning to Data
Open Source Tools for Chapter 6
Syntax for triples
RDF Schema
RDF parsers
Visualizing class relationships
References for Chapter 6
Glossary for Chapter 6
Chapter 7. Object-Oriented Data
Persistent data
Persistence is the ability of data to outlive the program that produced it.
SQLite databases
References for Chapter 7
Glossary for Chapter 7
Chapter 8. Problem Simplification
Open Source Tools for Chapter 8
Burrows-Wheeler transform
Winnowing and chaffing
References for Chapter 8
Glossary for Chapter 8