In my previous post, I wrote about Git workflows. Now I am going to try out the simple 'Feature Branch Workflow'. 1. I pull down the latest changes from master: git checkout master, then git pull origin master. 2. I create a branch for my changes: git checkout -b new-feature. 3. Now I work on the feature. 4. I keep my feature branch fresh and up to ...
There are many workflows for Git: Centralized Workflow, Feature Branch Workflow, Gitflow Workflow, and Forking Workflow. In the Centralized Workflow, a team develops projects in exactly the same way as they do with Subversion. Using Git to power your development workflow presents a few advantages over SVN. First, it gives every developer their own local copy of the entire project. This isolated environment lets each developer work independently of ...
Different chart types need different data models. This post covers Google chart types and the data models they support. Bar charts and column charts: Each bar of the chart represents the value of an element on the x-axis. Bar charts display tooltips when the user hovers over the data. The vertical version of this chart is called the 'column chart'. Each ...
In Google Charts, different chart types take different formats of data sets. Google Chart Tools work with their default settings, and all customizations are optional. Every chart exposes a number of options that customize its look and feel. These options are expressed as name:value pairs in the options object. For example, a visualization can support a colors option that lets you specify "colors": ...
Google Charts provides many chart types that are useful for data visualization. Charts are highly interactive and expose events that let you connect them to create complex dashboards. Charts are rendered using HTML5/SVG technology to provide cross-browser compatibility. All chart types are populated with data using the DataTable class, making it easy to switch between chart types. Google Charts contains ...
For a few days I was working on pattern mining over huge files and came across millions of patterns (of different lengths, from 2 to 150). Now I am looking for regex generation algorithms and came across ‘Grammar induction’, which we learned a little about at university. But there is much more to it. Grammar induction: Grammar induction, also ...
'Configuration files', or 'config files', configure the initial settings for some computer programs. They are used for user applications and can be changed as needed. With them, an administrator can control which protected resources an application can access, which versions of assemblies an application will use, and where remote applications and objects are located. It is important to have config files in ...
An n-gram is a contiguous sequence of n items from a given sequence of text or speech. The items can be syllables, letters, words or base pairs according to the application. n-grams may also be called shingles. Tokenization: My first post was mainly on this.
from nltk.tokenize import RegexpTokenizer

tokenizer = RegexpTokenizer("[a-zA-Z'`]+")
# skipping the numbers in here, include ...
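The n-gram definition above can be illustrated with a small sketch in plain Python. The helper name ngrams and the sample sentence are assumptions made for illustration, not taken from the original post, and NLTK is deliberately left out so the example stays self-contained.

```python
def ngrams(tokens, n):
    """Return every contiguous run of n items from a token sequence.

    Hypothetical helper for illustration; not part of NLTK.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    tokens = list(tokens)
    # Slide a window of width n across the tokens; a sequence shorter
    # than n yields an empty list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Made-up sample sentence, split on whitespace as a stand-in tokenizer.
words = "to be or not to be".split()
bigrams = ngrams(words, 2)
trigrams = ngrams(words, 3)
print(bigrams)
print(trigrams)
```

A sequence of k tokens yields k - n + 1 n-grams, so the six words here give five bigrams and four trigrams.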
The previous post was about installing and introducing NLTK and searching text with NLTK's basic functions. This post mainly covers ‘Texts as Lists of Words’, since a text is nothing more than a sequence of words and punctuation. Frequency distribution is also visited at the end of this post. sent1 = ['Today', 'I', 'call', 'James', '.'] len(sent1) -> 5 Concatenation combines ...
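The idea of a text as a list of words can be sketched with plain Python lists. Here sent1 comes from the excerpt, while sent2 is a made-up second sentence, and collections.Counter from the standard library stands in for a frequency distribution so the example runs without NLTK.

```python
from collections import Counter

sent1 = ['Today', 'I', 'call', 'James', '.']
sent2 = ['James', 'did', 'not', 'answer', '.']  # hypothetical second sentence

# Punctuation counts as a token, so sent1 has five items.
print(len(sent1))

# Concatenation with + builds one longer list of tokens.
text = sent1 + sent2
print(len(text))

# Count how often each token occurs in the combined text.
freq = Counter(text)
print(freq['James'])
print(freq.most_common(2))
```

Because a text is just a list, the usual list operations (indexing, slicing, len) apply to it directly.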
What is NLTK? Natural Language Toolkit (NLTK) is a leading platform for building Python programs to work with human language data (Natural Language Processing). It is accompanied by a book that explains the underlying concepts behind the language processing tasks supported by the toolkit. NLTK is intended to support research and teaching in NLP or closely related areas, including empirical ...
Affinity Propagation (AP)[1] is a relatively new clustering algorithm based on the concept of "message passing" between data points. AP does not require the number of clusters to be determined or estimated before running the algorithm. “An algorithm that identifies exemplars among data points and forms clusters of datapoints around these exemplars. It operates by simultaneously considering all data point ...
Prerequisites: Java 1.7, Maven 3.2.x or 3.3.x, Node.js, Cygwin. Here are my versions on Windows 8 (64-bit). Incubator-zeppelin built successfully. A few issues you can face on Windows: ERROR 01: [ERROR] Failed to execute goal com.github.eirslett:frontend-maven-plugin:0.0.23:bower (bower install) on project zeppelin-web: Failed to run task: 'bower --allow-root install' failed. (error code 1) -> [Help 1] You can find ...
The previous post was an introduction to the Zeppelin notebook. Here we take a more detailed look at how it can be used by researchers. Using the shell interpreter, we can download or retrieve data sets and files from a remote server or the internet. Then we use Scala in Spark to build a class from that data, and SQL to play with the data. You can analyze ...
Here is my previous post on building Zeppelin from source. This post takes you on a tour of the “notebook feature of Zeppelin”. A notebook contains notes, and each note has paragraphs. 1. Start Zeppelin by entering: /incubator-zeppelin $ ./bin/zeppelin-daemon.sh start 2. Go to localhost:8080 and click on ‘NoteBook’ in the top menu. Then click on ‘Create new note’. Now you will ...
Data binding is the process that establishes a connection between the application UI (User Interface) and the model/business logic. In the JavaScript world we use 'Backbone.js', 'KnockoutJS', 'BindingJS' and 'AngularJS'. This post goes through data binding in Angular. Traditional Data Binding System: Most web frameworks focus on one-way data binding, and classical template systems work in only one direction. They ...
AngularJS is an open-source web application framework maintained by Google and a community of individual developers. It addresses many of the challenges encountered in developing single-page applications. Angular [1] is built around the belief that declarative programming should be used for building user interfaces and connecting software components, while imperative programming is better suited to defining an application's business logic. ...
React is an open source UI library developed at Facebook to facilitate the creation of interactive and reusable UI components. Not only does it perform on the client side, but it can also be rendered server side, and the two can work together interoperably. React has pluggable back-ends so it can be used to target the DOM, HTML, canvas, ...
Density-based spatial clustering of applications with noise (DBSCAN)[1] is a density-based clustering algorithm. Given a set of points in some space, it groups together points that are closely packed (points with many nearby neighbors) and marks as outliers points that lie alone in low-density regions. In 2014, the algorithm was awarded the test of time award at the leading ...
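The grouping rule described above can be sketched in a few lines of plain Python. This is a minimal, unoptimized illustration, not scikit-learn's implementation: the function names, the parameter names eps and min_pts (following common usage), and the sample points are all assumptions made for the example. It requires Python 3.8+ for math.dist.

```python
import math

NOISE = -1  # label used here for outliers (an illustrative convention)


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx], itself included."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # not a core point; may become a border point later
            continue
        cluster += 1  # point i is a core point, so it starts a new cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep growing
    return labels


# Made-up data: two tight squares of points and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

Each call to region_query scans every point, so this sketch is quadratic in the number of points; practical implementations typically use a spatial index for the neighbor search.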
Scikit-learn is an open source machine learning library for the Python programming language. It features various classification, regression and clustering algorithms, including support vector machines, logistic regression, naive Bayes, random forests, gradient boosting, k-means, DBSCAN, decision trees, Gaussian processes for ML, manifold learning, Gaussian mixture models, model selection, nearest neighbors, semi-supervised classification, feature selection, etc. I was working on them ...
Introduction: The Apache CouchDB project has announced a Developer Preview release of CouchDB 2.0. The Developer Preview 2.0 brings all-new clustering technology to the open source NoSQL database, enabling a range of big data capabilities that include being able to store, replicate, sync, and process large amounts of data distributed across individual servers, data centers, and geographical regions in ...