Thursday, 5 February 2015

Google Search Algorithm Update Yesterday

Google Update Brewing: it looks like an algorithm update is rolling out in Google search right now. It is hard to say whether it is Panda, Penguin or something unrelated; at this point it does not look like Penguin, but it may be Panda related. Google has not confirmed an update, yet something significant does appear to have landed in the Google search results yesterday.


There is a great deal of discussion and chatter about it at WebmasterWorld, and all but one of the automated tracking tools showed significant changes.

Friday, 26 September 2014

Panda 4.1 — Google’s 27th Panda Update — Is Rolling Out



Google has announced that the latest version of its Panda Update — a filter designed to penalize “thin” or poor content from ranking well — has been released.
Google said in a post on Google+ that a “slow rollout” began earlier this week and will continue into next week, before being complete. Google said that depending on location, about 3%-to-5% of search queries will be affected.
What's different about this latest release? Google says it’s supposed to be more precise and will allow more high-quality small and medium-sized sites to rank better. From the post:
Based on user (and webmaster!) feedback, we’ve been able to discover a few more signals to help Panda identify low-quality content more precisely. This results in a greater diversity of high-quality small- and medium-sized sites ranking higher, which is nice.
New Chance for Some; New Penalty for Others
The rollout means anyone who was penalized by Panda in the last update has a chance to emerge, if they made the right changes. So if you were hit by Panda and made alterations to your site, you’ll know by the end of next week whether those changes were good enough: if so, you should see an increase in traffic.
The rollout also means that new sites not previously hit by Panda might get impacted. If you’ve seen a sudden traffic drop from Google this week, or note one in the coming days, then this latest Panda Update is likely to blame.

About That Number
Why are we calling it Panda 4.1? Well, Google itself called the last one Panda 4.0 and deemed it a major update. This isn’t as big of a change, so we’re going with Panda 4.1.
We actually prefer to number these updates in the order that they’ve happened, because trying to determine if something is a “major” or “minor” Panda Update is imprecise and leads to numbering absurdities like having a Panda 3.92 Update.
But since Google called the last one Panda 4.0, we went with that name — and we’ll continue on with the old-fashioned numbering system unless it gets absurd again.
For the record, here’s the list of confirmed Panda Updates, with some of the major changes called out with their AKA (also known as) names:

Panda Update 1, AKA Panda 1.0, Feb. 24, 2011 (11.8% of queries; announced; English in US only)
Panda Update 2, AKA Panda 2.0, April 11, 2011 (2% of queries; announced; rolled out in English internationally)
Panda Update 3, May 10, 2011 (no change given; confirmed, not announced)
Panda Update 4, June 16, 2011 (no change given; confirmed, not announced)
Panda Update 5, July 23, 2011 (no change given; confirmed, not announced)
Panda Update 6, Aug. 12, 2011 (6-9% of queries in many non-English languages; announced)
Panda Update 7, Sept. 28, 2011 (no change given; confirmed, not announced)
Panda Update 8, AKA Panda 3.0, Oct. 19, 2011 (about 2% of queries; belatedly confirmed)
Panda Update 9, Nov. 18, 2011 (less than 1% of queries; announced)
Panda Update 10, Jan. 18, 2012 (no change given; confirmed, not announced)
Panda Update 11, Feb. 27, 2012 (no change given; announced)
Panda Update 12, March 23, 2012 (about 1.6% of queries impacted; announced)
Panda Update 13, April 19, 2012 (no change given; belatedly revealed)
Panda Update 14, April 27, 2012 (no change given; confirmed; first update within days of another)
Panda Update 15, June 9, 2012 (1% of queries; belatedly announced)
Panda Update 16, June 25, 2012 (about 1% of queries; announced)
Panda Update 17, July 24, 2012 (about 1% of queries; announced)
Panda Update 18, Aug. 20, 2012 (about 1% of queries; belatedly announced)
Panda Update 19, Sept. 18, 2012 (less than 0.7% of queries; announced)
Panda Update 20, Sept. 27, 2012 (2.4% of English queries impacted; belatedly announced)
Panda Update 21, Nov. 5, 2012 (1.1% of English-language queries in US; 0.4% worldwide; confirmed, not announced)
Panda Update 22, Nov. 21, 2012 (0.8% of English queries were affected; confirmed, not announced)
Panda Update 23, Dec. 21, 2012 (1.3% of English queries were affected; confirmed, announced)
Panda Update 24, Jan. 22, 2013 (1.2% of English queries were affected; confirmed, announced)
Panda Update 25, March 15, 2013 (confirmed as coming; not confirmed as having happened)
Panda Update 26, AKA Panda 4.0, May 20, 2014 (7.5% of English queries were affected; confirmed, announced)
Panda Update 27, AKA Panda 4.1, Sept. 25, 2014 (3-5% of queries were affected; confirmed, announced)
The latest update comes four months after the last, which suggests that this might be a new quarterly cycle that we’re on. Panda had been updated on a roughly monthly basis during 2012. In 2013, most of the year saw no update at all.
Of course, there could have been unannounced releases of Panda that have happened. The list above is only for those that have been confirmed by Google.

Thursday, 19 December 2013

Improving URL removals on third-party sites

Content on the Internet changes or disappears, and occasionally it's helpful to have search results for it updated quickly. Today we launched our improved public URL removal tool to make it easier to request updates based on changes on other people's websites. You can find it at

https://www.google.com/webmasters/tools/removals




This tool is useful for removals on other people's websites. You could use this tool if a page has been removed completely, or if it was just changed and you need to have the snippet & cached page removed. If you're the webmaster of the site, then using the Webmaster Tools URL removal feature is faster & easier.

How to request a page be removed from search results


If the page itself was removed completely, you can request that it's removed from Google's search results. For this, it's important that the page returns the proper HTTP result code (403, 404, or 410), has a noindex robots meta tag, or is blocked by the robots.txt (blocking via robots.txt may not prevent indexing of the URL permanently). You can check the HTTP result code with an HTTP header checker. While we attempt to recognize "soft-404" errors, having the website use a clear response code is always preferred. Here's how to submit a page for removal (a short script for checking the result code yourself follows these steps):

  1.     Enter the URL of the page. As before, this needs to be the exact URL as indexed in our search results. Here's how to find the URL.
  2.     The analysis tool will confirm that the page is gone. Confirm the request to complete the submission.
  3.     There's no step three!
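
If you want to verify the result code before submitting, a quick check is easy to script. Below is a minimal, illustrative Python sketch using only the standard library; the URL is a placeholder, and any HTTP header checker will report the same thing.

```python
# Minimal sketch: check the HTTP result code of a removed page before
# requesting removal. The URL below is only a placeholder.
import urllib.request
import urllib.error

def http_status(url):
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses land here; the code is what we want to see.
        return error.code

status = http_status("http://www.example.com/removed-page")
if status in (403, 404, 410):
    print(status, "- the page is gone, so a removal request should go through")
else:
    print(status, "- the page still resolves; fix the response code first")
```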

How to request a page's cache & snippet be removed from search results

If the page wasn't removed, you can also use this tool to let us know that text on a page (such as a name) has been removed or changed. It'll remove the snippet & cached page in Google's search results until our systems have been able to reprocess the page completely (it won't affect title or ranking). In addition to the page's URL, you'll need at least one word that used to be on the page but is now removed. You can learn more about cache removals in our Help Center.
  1.     Enter the URL of the page which has changed. This needs to be the exact URL as indexed in our search results. Here's how to find the URL.
  2.     Confirm that the page has been updated or removed, and confirm that the cache & snippet are outdated (do not match the current content).
  3.     Now, enter a word that no longer appears on the live page, but which is still visible in the cache or snippet. See our previous blog post on removals for more details.


You can find out more about URL removals in our Help Center, as well as in our earlier blog posts on removing URLs & directories, removing & updating cached content, removing content you don't own, and tracking requests + what not to remove.

We hope these changes make it easier for you to submit removal requests! We welcome your feedback in our removals help forum category, where other users may also be able to help with more complicated removal issues.

Source: http://googlewebmastercentral.blogspot.in/2013/12/improving-url-removals-on-third-party.html

Friday, 25 October 2013

How the Google Search Engine Works


Learn how Google discovers, crawls, and serves web pages


When you sit down at your computer and do a Google search, you're almost instantly presented with a list of results from all over the web. How does Google find web pages matching your query, and determine the order of search results?

In the simplest terms, you could think of searching the web as looking in a very large book with an impressive index telling you exactly where everything is located. When you perform a Google search, our programs check our index to determine the most relevant search results to be returned ("served") to you.

The three key processes in delivering search results to you are:


Crawling: Does Google know about your site? Can we find it?
Indexing: Can Google index your site?
Serving: Does the site have good and useful content that is relevant to the user's search?
Crawling

Crawling is the process by which Googlebot discovers new and updated pages to be added to the Google index.

We use a huge set of computers to fetch (or "crawl") billions of pages on the web. The program that does the fetching is called Googlebot (also known as a robot, bot, or spider). Googlebot uses an algorithmic process: computer programs determine which sites to crawl, how often, and how many pages to fetch from each site.

Google's crawl process begins with a list of web page URLs, generated from previous crawl processes, and augmented with Sitemap data provided by webmasters. As Googlebot visits each of these websites it detects links on each page and adds them to its list of pages to crawl. New sites, changes to existing sites, and dead links are noted and used to update the Google index.
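
As a rough illustration of that loop (seed URLs, fetch pages, detect links, queue the new ones), here is a deliberately simplified Python sketch. The seed URL is a placeholder, and real crawlers layer politeness delays, robots.txt handling, and prioritization on top of this.

```python
# Simplified illustration of the crawl loop described above: start from a list
# of seed URLs, fetch each page, detect links, and queue newly discovered URLs.
import urllib.request
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collects href values from anchor tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed_urls, max_pages=10):
    frontier = deque(seed_urls)   # URLs waiting to be crawled
    seen = set(seed_urls)         # URLs already discovered
    fetched = 0
    while frontier and fetched < max_pages:
        url = frontier.popleft()
        try:
            with urllib.request.urlopen(url, timeout=10) as response:
                html = response.read().decode("utf-8", errors="ignore")
        except Exception:
            continue              # dead link: noted and skipped
        fetched += 1
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)      # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)      # newly discovered page
    return seen

# Placeholder seed; a real crawl starts from previously known URLs and Sitemap data.
print(crawl(["http://www.example.com/"]))
```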

Google doesn't accept payment to crawl a site more frequently, and we keep the search side of our business separate from our revenue-generating AdWords service.

Indexing


Googlebot processes each of the pages it crawls in order to compile a massive index of all the words it sees and their location on each page. In addition, we process information included in key content tags and attributes, such as Title tags and ALT attributes. Googlebot can process many, but not all, content types. For example, we cannot process the content of some rich media files or dynamic pages.
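
Conceptually, the index maps each word to the pages and positions where it occurs, so a query can be answered without scanning every page. The toy Python sketch below, over made-up documents, shows that idea in miniature.

```python
# Toy inverted index: map each word to the documents and positions where it
# appears, which is conceptually what a search lookup consults.
from collections import defaultdict

documents = {  # made-up example pages
    "page1.html": "google crawls and indexes web pages",
    "page2.html": "the index maps words to pages",
}

index = defaultdict(list)
for url, text in documents.items():
    for position, word in enumerate(text.lower().split()):
        index[word].append((url, position))

# A query looks up each word in the index instead of scanning every page.
print(index["pages"])   # -> [('page1.html', 5), ('page2.html', 5)]
```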

Serving results
When a user enters a query, our machines search the index for matching pages and return the results we believe are the most relevant to the user. Relevancy is determined by over 200 factors, one of which is the PageRank for a given page. PageRank is the measure of the importance of a page based on the incoming links from other pages. In simple terms, each link to a page on your site from another site adds to your site's PageRank. Not all links are equal: Google works hard to improve the user experience by identifying spam links and other practices that negatively impact search results. The best types of links are those that are given based on the quality of your content.
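
For a concrete sense of the link-counting idea, here is a small, illustrative power-iteration sketch over a made-up link graph. The 0.85 damping factor is the commonly cited textbook value, not a statement about Google's current implementation, and the real system operates at a vastly larger scale with many refinements.

```python
# Illustrative PageRank power iteration over a tiny made-up link graph.
# Each page's score is spread evenly across its outgoing links.
links = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
}

def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    rank = {page: 1.0 / len(pages) for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / len(pages) for page in pages}
        for page, outgoing in links.items():
            share = rank[page] / len(outgoing)
            for target in outgoing:
                new_rank[target] += damping * share
        rank = new_rank
    return rank

print(pagerank(links))  # c.html collects links from both a and b, so it scores highest
```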

In order for your site to rank well in search results pages, it's important to make sure that Google can crawl and index your site correctly. Our Webmaster Guidelines outline some best practices that can help you avoid common pitfalls and improve your site's ranking.

Google's Did you mean and Google Autocomplete features are designed to help users save time by displaying related terms, common misspellings, and popular queries. Like our google.com search results, the keywords used by these features are automatically generated by our web crawlers and search algorithms. We display these predictions only when we think they might save the user time. If a site ranks well for a keyword, it's because we've algorithmically determined that its content is more relevant to the user's query.

Monday, 14 October 2013

Use of Robots in SEO


Optimal Format:
Robots.txt needs to be placed in the top-level directory of a web server in order to be useful. Example: http://www.example.com/robots.txt
What is Robots.txt?
The Robots Exclusion Protocol (REP) is a group of web standards that regulate web robot behavior and search engine indexing. The REP consists of the following:
The original REP from 1994, extended 1997, defining crawler directives for robots.txt. Some search engines support extensions like URI patterns (wild cards).
Its extension from 1996 defining indexer directives (REP tags) for use in the robots meta element, also known as "robots meta tag." Meanwhile, search engines support additional REP tags with an X-Robots-Tag. Webmasters can apply REP tags in the HTTP header of non-HTML resources like PDF documents or images.
The Microformat rel-nofollow from 2005 defining how search engines should handle links where the A Element's REL attribute contains the value "nofollow."
Robots Exclusion Protocol Tags:
Applied to a URI, REP tags (noindex, nofollow, unavailable_after) steer particular tasks of indexers, and in some cases (nosnippet, noarchive, noodp) even query engines at runtime of a search query. Other than with crawler directives, each search engine interprets REP tags differently. For example, Google wipes out even URL-only listings and ODP references on their SERPs when a resource is tagged with "noindex," but Bing sometimes lists such external references to forbidden URLs on their SERPs. Since REP tags can be supplied in META elements of X/HTML contents as well as in HTTP headers of any web object, the consensus is that contents of X-Robots-Tags should overrule conflicting directives found in META elements.
Microformats
Indexer directives put as micro-formats will overrule page settings for particular HTML elements. For example, when a page's X-Robots-Tag states "follow" (there's no "nofollow" value), the rel-nofollow directive of a particular A element (link) wins.
Although robots.txt lacks indexer directives, it is possible to set indexer directives for groups of URIs with server-side scripts acting on site level that apply X-Robots-Tags to requested resources. This method requires programming skills and a good understanding of web servers and the HTTP protocol.
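As a rough illustration of that approach, the sketch below attaches an X-Robots-Tag header to PDF responses at the application layer. Flask is used here purely as an example framework and is not part of the original text; any server-side layer that can set response headers would do, and the same effect can often be achieved with plain web server configuration.

```python
# Illustrative sketch only: applying an X-Robots-Tag indexer directive to
# non-HTML resources (here, PDFs) that cannot carry a robots meta tag.
from flask import Flask, send_from_directory

app = Flask(__name__)

@app.route("/reports/<path:filename>")
def serve_report(filename):
    response = send_from_directory("reports", filename)
    if filename.lower().endswith(".pdf"):
        # Tell indexers not to index or archive this resource.
        response.headers["X-Robots-Tag"] = "noindex, noarchive"
    return response

if __name__ == "__main__":
    app.run()
```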
Pattern Matching
Google and Bing both honor two regular expression characters that can be used to identify pages or sub-folders that an SEO wants excluded. These two characters are the asterisk (*) and the dollar sign ($); a short sketch after this list shows how they behave.

·         * - a wildcard that represents any sequence of characters
·         $ - which matches the end of the URL
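
Here is a small, illustrative Python sketch of that behavior; it translates a robots.txt path pattern into a regular expression and tests made-up URL paths against it. It is only a model of the two characters described above, not a full robots.txt parser.

```python
# Rough model of robots.txt pattern matching with * and $.
# The patterns and URL paths below are made up for the example.
import re

def rule_matches(pattern, path):
    # Escape everything, then restore * as "any sequence of characters"
    # and a trailing $ as "end of the URL".
    regex = re.escape(pattern).replace(r"\*", ".*")
    if regex.endswith(r"\$"):
        regex = regex[:-2] + "$"
    return re.match(regex, path) is not None

print(rule_matches("/private/*", "/private/reports/2013.html"))  # True
print(rule_matches("/*.pdf$", "/downloads/guide.pdf"))           # True
print(rule_matches("/*.pdf$", "/downloads/guide.pdf?page=2"))    # False: $ requires the URL to end there
```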
Public Information
The robots.txt file is public—be aware that a robots.txt file is a publicly available file. Anyone can see what sections of a server the webmaster has blocked the engines from. This means that if an SEO has private user information that they don’t want publicly searchable, they should use a more secure approach—such as password protection—to keep visitors from viewing any confidential pages they don't want indexed.
Important Rules
In most cases, meta robots with parameters "noindex, follow" should be employed as a way to restrict crawling or indexation.
It is important to note that malicious crawlers are likely to completely ignore robots.txt and as such, this protocol does not make a good security mechanism.
Only one "Disallow:" line is allowed for each URL.
Each subdomain on a root domain uses separate robots.txt files.
Google and Bing accept two specific regular expression characters for pattern exclusion (* and $).
The filename of robots.txt is case sensitive. Use "robots.txt", not "Robots.TXT."
Spacing is not an accepted way to separate query parameters. For example, "/category/ /product page" would not be honored by robots.txt.
SEO Best Practice:
Blocking Page
There are a few ways to block search engines from accessing a given domain:
Block with Robots.txt
This tells the engines not to crawl the given URL, but that they may keep the page in the index and display it in results. (See image of Google results page below.)
Block with Meta NoIndex
This tells engines they can visit, but are not allowed to display the URL in results. This is the recommended method.
Block by Nofollowing Links
This is almost always a poor tactic. Using this method, it is still possible for the search engines to discover pages in other ways: through browser toolbars, links from other pages, analytics, and more.

Why Meta Robots is better than Robots.txt
Below is an example of about.com's robots.txt file. Notice that they are blocking the directory /library/nosearch/.

[Image: about.com's robots.txt file]

Now notice what happens when the URL is searched for in Google.

[Image: Google results for the blocked directory]
Google has 2,760 pages from that "disallowed" directory. The engine hasn't crawled these URLs, so they appear as bare URLs rather than traditional listings.
This becomes a problem when these pages accumulate links. Those pages can then accumulate link juice (ranking power) and other query-independent ranking metrics (like popularity and trust), but they can't pass these benefits to any other pages since the links on them never get crawled.
[Image: Google can't see links on these blocked pages]

In order to exclude individual pages from search engine indices, the noindex meta tag <meta name="robots" content="noindex"> is actually superior to robots.txt.

Sunday, 6 October 2013

Google Hummingbird Algorithm

Google has a new search algorithm, the system it uses to sort through all the information it has when you search and come back with answers. It’s called “Hummingbird,” and below is what we know about it so far.
What’s a “search algorithm?”
That’s a technical term for what you can think of as a recipe that Google uses to sort through the billions of web pages and other information it has, in order to return what it believes are the best answers.
What’s “Hummingbird?”
It’s the name of the new search algorithm that Google is using, one that Google says should return better results.
So that “PageRank” algorithm is dead?
No. PageRank is one of over 200 major “ingredients” that go into the Hummingbird recipe. Hummingbird looks at PageRank — how important links to a page are deemed to be — along with other factors like whether Google believes a page is of good quality, the words used on it and many other things (see our Periodic Table Of SEO Success Factors for a better sense of some of these).
Why is it called Hummingbird?
Google told us the name comes from being “precise and fast.”
When did Hummingbird start? Today?
Google started using Hummingbird about a month ago, it said. Google only announced the change today.
What does it mean that Hummingbird is now being used?
Think of a car built in the 1950s. It might have a great engine, but it might also be an engine that lacks things like fuel injection or be unable to use unleaded fuel. When Google switched to Hummingbird, it’s as if it dropped the old engine out of a car and put in a new one. It also did this so quickly that no one really noticed the switch.
When’s the last time Google replaced its algorithm this way?
Google struggled to recall when any type of major change like this last happened. In 2010, the “Caffeine Update” was a huge change. But that was also a change mostly meant to help Google better gather information (indexing) rather than sorting through the information. Google search chief Amit Singhal told me that perhaps 2001, when he first joined the company, was the last time the algorithm was so dramatically rewritten.
What about all these Penguin, Panda and other “updates” — haven’t those been changes to the algorithm?
Panda, Penguin and other updates were changes to parts of the old algorithm, but not an entire replacement of the whole. Think of it again like an engine. Those things were as if the engine received a new oil filter or had an improved pump put in. Hummingbird is a brand new engine, though it continues to use some of the same parts of the old, like Penguin and Panda.
The new engine is using old parts?
Yes. And no. Some of the parts are perfectly good, so there was no reason to toss them out. Other parts are constantly being replaced. In general, Hummingbird — Google says — is a new engine built on both existing and new parts, organized in a way to especially serve the search demands of today, rather than one created for the needs of ten years ago, with the technologies back then.
What type of “new” search activity does Hummingbird help?
“Conversational search” is one of the biggest examples Google gave. People, when speaking searches, may find it more useful to have a conversation.
“What’s the closest place to buy the iPhone 5s to my home?” A traditional search engine might focus on finding matches for words — finding a page that says “buy” and “iPhone 5s,” for example.
Hummingbird should better focus on the meaning behind the words. It may better understand the actual location of your home, if you’ve shared that with Google. It might understand that “place” means you want a brick-and-mortar store. It might get that “iPhone 5s” is a particular type of electronic device carried by certain stores. Knowing all these meanings may help Google go beyond just finding pages with matching words.
In particular, Google said that Hummingbird is paying more attention to each word in a query, ensuring that the whole query — the whole sentence or conversation or meaning — is taken into account, rather than particular words. The goal is that pages matching the meaning do better, rather than pages matching just a few words.
I thought Google did this conversational search stuff already!
It does (see Google’s Impressive “Conversational Search” Goes Live On Chrome), but it had only been doing it really within its Knowledge Graph answers. Hummingbird is designed to apply the meaning technology to billions of pages from across the web, in addition to Knowledge Graph facts, which may bring back better results.
Does it really work? Any before-and-afters?
We don’t know. There’s no way to do a “before-and-after” ourselves, now. Pretty much, we only have Google’s word that Hummingbird is improving things. However, Google did offer some before-and-after examples of its own, that it says shows Hummingbird improvements.
A search for “acid reflux prescription” used to list a lot of drugs (such as this, Google said), which might not necessarily be the best way to treat the disease. Now, Google says results have information about treatment in general, including whether you even need drugs, such as this as one of the listings.
A search for “pay your bills through citizens bank and trust bank” used to bring up the home page for Citizens Bank but now should return the specific page about paying bills.
A search for “pizza hut calories per slice” used to list an answer like this, Google said, but not one from Pizza Hut. Now, it lists this answer directly from Pizza Hut itself, Google says.
Could it be making Google worse?
Almost certainly not. While we can’t say that Google’s gotten better, we do know that Hummingbird — if it has indeed been used for the past month — hasn’t sparked any wave of consumers complaining that Google’s results suddenly got bad. People complain when things get worse; they generally don’t notice when things improve.
Does this mean SEO is dead?
No, SEO is not yet again dead. In fact, Google’s saying there’s nothing new or different SEOs or publishers need to worry about. Guidance remains the same, it says: have original, high-quality content. Signals that have been important in the past remain important; Hummingbird just allows Google to process them in new and hopefully better ways.
Does this mean I’m going to lose traffic from Google?
If you haven’t in the past month, well, you came through Hummingbird unscathed. After all, it went live about a month ago. If you were going to have problems with it, you would have known by now.
By and large, there’s been no major outcry among publishers that they’ve lost rankings. This seems to support Google saying this is very much a query-by-query effect, one that may improve particular searches — particularly complex ones — rather than something that hits “head” terms that can, in turn, cause major traffic shifts.
But I did lose traffic!
Perhaps it was due to Hummingbird, but Google stressed that it could also be due to some of the other parts of its algorithm, which are always being changed, tweaked or improved. There’s no way to know.
How do you know all this stuff?
Google shared some of it at its press event today, and then I talked with two of Google’s top search execs, Amit Singhal and Ben Gomes, after the event for more details. I also hope to do a more formal look at the changes from those conversations in the near future. But for now, hopefully you’ve found this quick FAQ based on those conversations to be helpful.
By the way, another term for the “meaning” connections that Hummingbird does is “entity search,” and we have an entire panel on that at our SMX East search marketing show in New York City, next week. The Coming “Entity Search” Revolution session is part of an entire “Semantic Search” track that also gets into ways search engines are discovering meanings behind words. Learn more about the track and the entire show on the agenda page.

Google Penguin Update 2013

Penguin 5, With The Penguin 2.1 Spam-Filtering Algorithm, Is Now Live

The fifth confirmed release of Google’s “Penguin” spam-fighting algorithm is live. That makes it Penguin 5 by our count. But since this Penguin update is using a slightly improved version of Google’s “Penguin 2” second-generation technology, Google itself is calling it “Penguin 2.1.” Don’t worry. We’ll explain the numbering nonsense below, as well as what this all means for publishers.

New Version Of Penguin Live Today
The head of Google’s web spam team, Matt Cutts, shared the news on Twitter, saying the latest release would impact about 1 percent of all searches:


The link that Cutts points at, by the way, explains what Penguin was when it was first launched. It doesn’t cover anything new or changed with the latest release.

Previous Updates
Here are all the confirmed releases of Penguin to date:

Penguin 1 on April 24, 2012 (impacting around 3.1% of queries)
Penguin 2 on May 26, 2012 (impacting less than 0.1%)
Penguin 3 on October 5, 2012 (impacting around 0.3% of queries)
Penguin 4 (AKA Penguin 2.0) on May 22, 2013 (impacting 2.3% of queries)
Penguin 5 (AKA Penguin 2.1) on Oct. 4, 2013 (impacting around 1% of queries)
Why Penguin 2.1 AND Penguin 5?
If us talking about Penguin 5 in reference to something Google is calling Penguin 2.1 hurts your head, believe us, it hurts ours, too. But you can pin that blame back on Google. Here’s why.

When Google started releasing its “Panda” algorithm designed to fight low-quality content, it called the first one simply “Panda.” So when the second came out, people referred to that as “Panda 2.” When the third came out, people called that Panda 3 — causing Google to say that the third release, because it was relatively minor, really only should be called Panda 2.1 — the “point” being used to indicate how much a minor change it was.

Google eventually — and belatedly — indicated that a Panda 3 release happened, causing the numbering to move into Panda 3.0, Panda 3.1 and so on, until there had been so many “minor” updates that we had to resort to going further out in decimal places, to things like Panda 3.92.

That caused us here at Search Engine Land to decide it would be easier all around if we just numbered any confirmed update sequentially, in order of when they came. No matter how “big” or “small” an update might be, we’d just give it the next number on the list: Penguin 1, Penguin 2, Penguin 3 and so on.

Thanks For The Headache, Google
That worked out fine until Penguin 4, because Google typically didn’t give these updates numbers itself. It just said there was an update, and left it to us or others to attach a number to it.

But when Penguin 4 arrived, Google really wanted to stress that it was using what it deemed to be a major, next-generation change in how Penguin works. So, Google called it Penguin 2, despite all the references to a Penguin 2 already being out there, despite the fact it hadn’t really numbered many of these various updates before.

Today’s update, as can be seen above, has been dubbed Penguin 2.1 — so supposedly, it’s a relatively minor change to the previous Penguin filter that was being used. However, if it’s impacting around 1 percent of queries as Google says, that means it is more significant than what Google might have considered to be similar “minor” updates of Penguin 1.1 and Penguin 1.2.

What Is Penguin Again? And How Do I Deal With It?
For those new to the whole “Penguin” concept, Penguin is a part of Google’s overall search algorithm that periodically looks for sites that are deemed to be spamming Google’s search results but somehow still ranking well. In particular, it goes after sites that may have purchased paid links.

If you were hit by Penguin, you’ll likely know if you see a marked drop in traffic that begins today or tomorrow. To recover, you’ll need to do things like disavow bad links or manually have those removed. Filing a reconsideration request doesn’t help, because Penguin is an automated process. Until it sees that what it considers to be bad has been removed, you don’t recover.

If you were previously hit by Penguin and have taken actions hopefully meant to fix that, today and tomorrow are the days to watch. If you see an improvement in traffic, that’s a sign that you’ve escaped Penguin.

Here are previous articles with more on Penguin recovery and how it and other filters work as part of the ranking system:

The Google Dance Is Back
Two Weeks In, Google Talks Penguin Update, Ways To Recover & Negative SEO
How Google’s Disavow Links Tool Can Remove Penalties
Why Asking StumbleUpon To Remove Your Links Is Dumb
Google’s New Stance On Negative SEO: “Works Hard To Prevent” It
Still Seeing Post-Penguin Web Spam In Google Results? Let Google Know
Big Brand SEO & Penguin 2.0
Demystifying Link Disavowals, Penalties & More
What About Hummingbird?
If you’re wondering about how Penguin fits into that new Google Hummingbird algorithm  you may have heard about, think of Penguin as a part of Hummingbird, not as a replacement for it.

Hummingbird is like Google’s entire ranking engine, whereas Penguin is like a small part of that engine, a filter that is removed and periodically replaced with what Google considers to be a better filter to help keep out bad stuff.

To understand more about that relationship and Hummingbird in general, see our post below: