In the previous post we outlined how we form the commands and where we create the working directories. We will build three search indexes and merge them - the main subject is “nanaimo”, with tastes of “diving” and “music”.

It has become apparent that a sort of template is needed - an explanation of what is happening plus an example of the commands in a generic form. First, an outline of the project: we want to keep in mind the eventual process to follow once we progress past the initial stage of making sure that the program is working, etc. We’ll discuss what we are doing, and a complete list of commands that is generally copy/paste ready will be found at the end of this post.

Nutch (the program we use to create our search sites) is capable of acting in “distributed” fashion. Nutch is integrated with a program called “Hadoop” that provides the “distributed” functionality. In the case of the geocentric search engines there exists the possibility that Nanaimo along with another 10 cities on Vancouver Island might be approached to contribute their resources to an overall “Vancouver Island” vertical search engine, or that 200 towns and cities in BC might want to all contribute to an overall “BC” search engine. Hadoop makes it possible.

After this post we won’t explain the following terms and general concepts again. Nutch is an open-source search engine. It is written in the Java language and makes use of a program called “Tomcat” (“Apache Tomcat” actually).

Nutch consists of a crawler, a parser, a database to store webpages, an indexer and a user interface. The crawler “crawls” webpages, or “fetches” them. It knows which pages to crawl based on a “fetchlist”. The URLs of the webpages to be crawled are “injected” into the fetchlist. The parser(s) examine the webpages and store them in the database in the proper form so that they can be “indexed” by the indexer.
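To make the flow of inject, fetch, parse and index a little more concrete, here is a minimal sketch in Python. It is not Nutch code and does not use Nutch's API; the function names, the `FAKE_WEB` dictionary and the example URLs are all made up for illustration, and the "fetch" is just a dictionary lookup rather than a real web request.

```python
from collections import defaultdict

# Hypothetical in-memory "web": URL -> raw page text.
# This stands in for real HTTP fetches so the sketch runs offline.
FAKE_WEB = {
    "http://example.com/diving": "Diving charters in Nanaimo",
    "http://example.com/music": "Live music in Nanaimo tonight",
}


def inject(fetchlist, urls):
    """Add seed URLs to the fetchlist, skipping duplicates."""
    for url in urls:
        if url not in fetchlist:
            fetchlist.append(url)


def fetch(url):
    """'Crawl' one page; here it is only a dictionary lookup."""
    return FAKE_WEB.get(url, "")


def parse(raw_text):
    """Turn raw page text into lowercase tokens ready for indexing."""
    return raw_text.lower().split()


def build_index(fetchlist):
    """Fetch and parse every URL, then map each term to the URLs containing it."""
    index = defaultdict(set)
    for url in fetchlist:
        for term in parse(fetch(url)):
            index[term].add(url)
    return index


def search(index, term):
    """Return the sorted list of URLs whose text contains the term."""
    return sorted(index.get(term.lower(), set()))


fetchlist = []
inject(fetchlist, list(FAKE_WEB))
index = build_index(fetchlist)
print(search(index, "nanaimo"))
print(search(index, "diving"))
```

The real program keeps these stages separate and stores the results on disk between them; the sketch only shows how the pieces hand off to each other.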

The normal behavior of an agent that crawls (fetches) web pages is to get the webpage, extract the links on that page to other webpages, go and get those linked pages, extract the links from them, and repeat. The goal of Google, Yahoo, et al. is to have crawled and indexed every page that exists. Google probably has close to 10 billion pages indexed. What we want is only the minimum number of pages necessary to provide the best results for the searcher. We do so with the understanding that if they are looking for something really specific then Google is more likely to have that page indexed; hopefully the search wording can be defined well enough that the result will not be #800, behind a seemingly infinite number of hotel reservation and viagra sales sites.
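The fetch, extract links, repeat loop described above, with a cap on the number of pages so the crawl stays small, can be sketched roughly as follows. This is an illustration only: the `LINKS` graph and the `max_pages` parameter are assumptions made for the example, and a real crawler would fetch pages over the network instead of reading a dictionary.

```python
from collections import deque

# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": ["e"],
    "e": [],
}


def crawl(seeds, max_pages):
    """Breadth-first crawl: fetch a page, queue its unseen links, repeat.

    Stops once max_pages pages have been fetched or the queue is empty.
    """
    seen = set(seeds)
    queue = deque(seeds)
    fetched = []
    while queue and len(fetched) < max_pages:
        page = queue.popleft()
        fetched.append(page)
        for link in LINKS.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched


print(crawl(["a"], max_pages=3))    # stops early at the page cap
print(crawl(["a"], max_pages=100))  # cap never reached, crawls everything reachable
```

Capping the crawl is one simple way to keep only a small set of pages; restricting which links are followed would be another.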

When we say that the search results of our vertical search engine are better than those of Google et al., what we mean is that there is a “flavor” to the results. Our results will never be more comprehensive than theirs because we don’t index anywhere near the number of webpages that they do. A vertical search engine for a city of 100,000 people might well have fewer than 100,000 pages indexed, as opposed to Google’s 5 billion. Of course, most people rarely delve down to the #818 result using Google; the point is that all those pages that Google has tend to obscure a lot of the pages that have real relevance.

We are in the process of introducing some technologies for geo-generic sites. For the most part right now those technologies are WordPress plugins for general use over all geo-generics, but there are also a few generic-specific ones, specifically one for the “search” generic. We will eventually be demonstrating it at “NanaimoSearch(dot)com”. We are developing a vertical search engine process that geos such as “NanaimoSearch(dot)com”, “ChicagoSearch(dot)com”, “LosAngelesSearch(dot)com”, etc. can use to provide better, more relevant results than the major search engines such as Google, Yahoo and MSN.

The assumption is that an economy that is in trouble will necessarily put more importance on the more local economies. A world economy crisis will “degenerate” into national economies and a national economy crisis will result in more individual reliance on the local economy. Eventually it will break down into geographically based economies that would range in locality from “everything west of the Rocky Mountains” to “only things in my immediate neighborhood” and encompassing everything in between.

I’ve been doing a little research lately concerning a whole bunch of things which end up being associated with each other and I thought I’d put down a few thoughts about what I’ve noticed.

There is a company called “Pay Per Post” that connects bloggers with people who want to pay them for blogging about something specific. I looked the site over pretty thoroughly, and I joined up. The prices paid range from $5 a post to $156 a post; at least there are some bloggers who don’t take on work for less, according to their profiles. I looked at many samples of their work, and admittedly the $5 posts are pretty bad, but the more expensive ones are really good.

We’ve been modifying the wp-shoppingcart plugin, which can be found at “instinct.co.nz”. We are adapting it for our purposes, which the standard version does not cover. There are a few things we need a shopping cart for.

We have a lot of data concerning the registrations of geo-generic dot coms. We took the top 600 cities in the USA/Canada/Australia/UK/NZ and 300 generics, which makes about 200,000 domains. We queried whois to get owners, registration dates, etc. and put the results into a database which we can run queries on.

This yields some interesting results like which generics were registered first (as a group), which cities had a given number of geo-generics registered first or by a certain date and other interesting stuff. We have published some of our findings on other sites but we intend to put them on this site as well and will shortly.
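As a rough illustration of how a list of geo-generic domains like the one described above could be put together, here is a small Python sketch. The function name and the sample cities and generics are assumptions for the example (the “hotels” generic in particular is only a placeholder), and it says nothing about how we actually queried whois.

```python
from itertools import product


def geo_generic_domains(cities, generics, tld="com"):
    """Build every city+generic domain name, e.g. nanaimosearch.com."""
    return [
        f"{city.lower().replace(' ', '')}{generic.lower()}.{tld}"
        for city, generic in product(cities, generics)
    ]


# Small sample lists; the real lists would hold 600 cities and 300 generics.
cities = ["Nanaimo", "Chicago", "Los Angeles"]
generics = ["search", "hotels"]
domains = geo_generic_domains(cities, generics)
print(domains)

# Size of the full city-by-generic grid: 600 * 300 = 180,000 names.
print(600 * 300)
```

Every city is paired with every generic, so the full grid works out to 180,000 candidate names, in line with the round figure of about 200,000 quoted above.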

We use WordPress and a whole bunch of WordPress plugins to develop our sites. Other people are using our setup and the plugins that we develop, and they have asked us which plugins we are using in our efforts. So here is a partial list, which we will add to as necessary.
