Perl

Perl http log file parser

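This is my HTTP log file parser. It counts requests from known crawlers, writes a copy of each log with the crawler lines removed, and lists every IP address with its request count and user agent, so you can see who is crawling your site. I hope you like it.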

#!/usr/bin/perl
use strict;
use warnings;

# Read every access log in the current directory, skipping the filtered
# "*.txt" copies that an earlier run of this script may have written.
my @files = grep { !/\.txt$/ } <*_access.log*>;

my %address = ();    # IP address => number of requests
my %agents  = ();    # IP address => user agent of its first request

# Crawlers to count, keyed by the label used in the summary lines.
my %bots = (
    google       => qr/Googlebot/,
    yahoo        => qr/Yahoo! Slurp/,
    cuil         => qr/cuil\.com/,
    twiceler     => qr/twiceler/,
    Jeeves       => qr/Ask Jeeves/,
    Yandex       => qr/Yandex/,
    legs         => qr/80legs/,
    Baiduspider  => qr/Baiduspider/,
    dotnetdotcom => qr/dotnetdotcom/,
    msn          => qr/msn\.com/,
    seoprofiler  => qr/seoprofiler/,
);

foreach my $file (@files)
{
    my %count = map { $_ => 0 } keys %bots;

    open(my $in, '<', $file) or die "Cannot read $file: $!";
    my $outfile = "$file.txt";
    open(my $out, '>', $outfile) or die "Cannot write $outfile: $!";
    while (my $line = <$in>)
    {
        # The IP is the first space-separated field; the user agent is the
        # last double-quoted field (the final piece is the trailing newline).
        my @arr   = split(/"/, $line);
        my $first = shift(@arr);
        pop(@arr);
        my $agent = pop(@arr);
        $agent = '' unless defined $agent;
        my ($ip) = split(/ /, $first);

        # Remember the agent the first time an IP is seen, then count the hit.
        $agents{$ip} = $agent unless exists $address{$ip};
        $address{$ip}++;

        # Count every crawler that matches this line.
        my $is_bot = 0;
        foreach my $name (keys %bots)
        {
            if ($line =~ $bots{$name})
            {
                $count{$name}++;
                $is_bot = 1;
            }
        }

        # Only non-crawler lines go to the filtered copy.
        print $out $line unless $is_bot;
    }
    close($in);
    close($out);
#    unlink ($file);
#    rename ($outfile, $file);
    print "google: $count{google}, Yahoo: $count{yahoo}, Cuil: $count{cuil}, twiceler: $count{twiceler}, Jeeves: $count{Jeeves}, Yandex: $count{Yandex}, legs: $count{legs}\n";
    print "Baiduspider: $count{Baiduspider}, dotnetdotcom: $count{dotnetdotcom}, msn: $count{msn}, seoprofiler: $count{seoprofiler}\n";
}

# Write one line per IP address, busiest first: count, IP, user agent.
open(my $report, '>', 'grant.txt') or die "Cannot write grant.txt: $!";
foreach my $key (sort hashValueDescendingNum (keys(%address)))
{
    print $report "$address{$key} \t $key\t $agents{$key}\n";
}
close($report);

# Sort helpers: order IP addresses by request count.
sub hashValueAscendingNum {
   $address{$a} <=> $address{$b};
}

sub hashValueDescendingNum {
   $address{$b} <=> $address{$a};
}
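
To see what the split on double quotes does, here is a minimal sketch run against a single log line. The helper name parse_line and the sample line are made up for illustration and are not part of the script above; the sketch assumes the log is in Apache "combined" format, where the user agent is the last quoted field.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the client IP and user agent out of one line
# in Apache "combined" log format (an assumption about the input).
sub parse_line {
    my ($line) = @_;
    chomp $line;
    my @quoted = split /"/, $line;        # pieces between double quotes
    my ($ip)   = split / /, $quoted[0];   # first field before the first quote
    my $agent  = $quoted[-1];             # last quoted field is the agent
    return ($ip, $agent);
}

# Made-up sample line, for illustration only.
my $sample = '192.0.2.7 - - [10/Oct/2010:13:55:36 -0700] '
           . '"GET /index.html HTTP/1.0" 200 2326 "-" '
           . '"Mozilla/5.0 (compatible; Googlebot/2.1)"';

my ($ip, $agent) = parse_line($sample);
print "$ip\t$agent\n";
```

If your server writes a different log format, the position of the agent field will differ and the split needs adjusting.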


Web Link to NCBI Entrez Databases

PubMed

NLM currently leases PubMed journal citations, at no charge.

Retrieve PubMed Citations

Base URL: http://www.ncbi.nlm.nih.gov/pubmed

To retrieve results in HTML or text format use PubMed Unique Identifiers (PMID).

  • Retrieval parameters:

    • report=display format (DocSum is the default display, except for a single citation)

    • format=text (HTML is the default format.)

    • tool=resource

    • email=address

Example:

Retrieve by PMID in Abstract format:

http://www.ncbi.nlm.nih.gov/pubmed/18276894

Retrieve by PMID in MEDLINE text format:

http://www.ncbi.nlm.nih.gov/pubmed/18276894,18276893?report=medline&format=text
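
The retrieval links above can also be assembled in a script. This is a minimal sketch, not an NCBI-provided API: the helper name pubmed_retrieve_url is hypothetical, and it only joins the PMIDs with commas and appends the report and format parameters described above.

```perl
use strict;
use warnings;

# Hypothetical helper: build a PubMed retrieval link for one or more PMIDs.
# %opt may carry the "report" and "format" parameters.
sub pubmed_retrieve_url {
    my ($pmids, %opt) = @_;
    my $url = 'http://www.ncbi.nlm.nih.gov/pubmed/' . join(',', @$pmids);
    my @params = map { "$_=$opt{$_}" }
                 grep { defined $opt{$_} } qw(report format);
    $url .= '?' . join('&', @params) if @params;
    return $url;
}

print pubmed_retrieve_url([18276894]), "\n";
print pubmed_retrieve_url([18276894, 18276893],
                          report => 'medline', format => 'text'), "\n";
```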

Search PubMed

Example:

PubMed antioxidant chocolate citations

http://www.ncbi.nlm.nih.gov/pubmed?term=antioxidant+chocolate

  • Activating Limits (see PubMed Help):

    • cmd_current=Limits

    • pmfilter_<filter name>=<filter value>, e.g. pmfilter_Subsets=AIDS

  • Turning off Limits:

    • pubmedfilters=true

Examples:

PubMed:
PubMed hay fever citations published in 2006, display the first 50:
http://www.ncbi.nlm.nih.gov/pubmed?term=hay+fever+AND+2006[pdat]&dispmax=50

PubMed citations on AZT limited to the AIDS subset:
http://www.ncbi.nlm.nih.gov/pubmed?term=azt&cmd_current=Limits&pmfilter_Subsets=AIDS

To turn off PubMed Limits and search for hay fever displayed in the Abstract format:
http://www.ncbi.nlm.nih.gov/pubmed?term=hay+fever&pubmedfilters=true&report=abstract
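
As a rough illustration of how the Limits parameters combine with a search term, the sketch below builds the AZT link shown above. The helper name pubmed_search_url and its arguments are assumptions made for this example; the parameter names come from the list above.

```perl
use strict;
use warnings;

# Hypothetical helper: PubMed search link with optional Limits.
# $filters is a hash ref of filter name => filter value.
sub pubmed_search_url {
    my ($term, $filters) = @_;
    (my $escaped = $term) =~ s/ /+/g;    # spaces become plus signs
    my $url = "http://www.ncbi.nlm.nih.gov/pubmed?term=$escaped";
    if ($filters && %$filters) {
        $url .= '&cmd_current=Limits';
        $url .= "&pmfilter_$_=$filters->{$_}" for sort keys %$filters;
    }
    return $url;
}

print pubmed_search_url('azt', { Subsets => 'AIDS' }), "\n";
print pubmed_search_url('hay fever'), "\n";
```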

PubMed Feature Pages

Other Entrez Databases

  • retrieve records in HTML or Text format using unique identifiers

  • search with terms

  • link to related records or neighbors

Retrieve

Base URL: http://www.ncbi.nlm.nih.gov/sites/entrez

To retrieve results in HTML or text format use unique identifiers (primary IDs). Use Search to retrieve by accession numbers.

Example:

Gene Full Report for GeneIDs 40048 & 847

http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=retrieve&db=gene&list_uids=40048,847&dopt=full_report

Example:

Protein sequence records in text format for GIs 9367031, 729567, 586553
http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=text&db=protein&dopt=genpept&uid=9367031&uid=729567&uid=586553
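
The two examples above pass identifiers differently: cmd=retrieve takes a comma-separated list_uids value, while cmd=text repeats the uid parameter. A minimal sketch of both, with hypothetical helper names, might look like this:

```perl
use strict;
use warnings;

my $base = 'http://www.ncbi.nlm.nih.gov/sites/entrez';

# Hypothetical helper: HTML retrieval, IDs joined into one list_uids value.
sub entrez_retrieve_url {
    my ($db, $dopt, @ids) = @_;
    return "$base?cmd=retrieve&db=$db&list_uids=" . join(',', @ids)
         . "&dopt=$dopt";
}

# Hypothetical helper: text retrieval, one uid parameter per ID.
sub entrez_text_url {
    my ($db, $dopt, @ids) = @_;
    return "$base?cmd=text&db=$db&dopt=$dopt"
         . join('', map { "&uid=$_" } @ids);
}

print entrez_retrieve_url('gene', 'full_report', 40048, 847), "\n";
print entrez_text_url('protein', 'genpept', 9367031, 729567, 586553), "\n";
```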

Search

Use search to create a web link for terms with or without Boolean operators. “Escape” spaces by converting them to plus signs (+), e.g., Biochem Soc Trans should be Biochem+Soc+Trans.

You may also use Details to generate a search URL.

Base URL: http://www.ncbi.nlm.nih.gov/sites/entrez

Example:
Protein records for AAC72193[accn] in the GenPept display:
http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=search&db=protein&term=AAC72193[accn]&doptcmdl=GenPept

Nucleotide records for COMT sequences in the Brief display:
http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=search&db=nucleotide&term=comt&doptcmdl=brief

OMIM:
OMIM records for the FBN1 gene in the Detailed display:
http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=search&db=omim&term=fbn1[gene]&doptcmdl=detailed
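
The escaping rule and the search links above can be sketched as follows. The helper names escape_term and entrez_search_url are hypothetical, and only the space-to-plus conversion described in this section is applied; other special characters are passed through unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper: "escape" spaces by converting them to plus signs.
sub escape_term {
    my ($term) = @_;
    $term =~ s/ /+/g;
    return $term;
}

# Hypothetical helper: Entrez search link for a database, term and display.
sub entrez_search_url {
    my ($db, $term, $display) = @_;
    return 'http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=search'
         . "&db=$db&term=" . escape_term($term) . "&doptcmdl=$display";
}

print escape_term('Biochem Soc Trans'), "\n";
print entrez_search_url('protein', 'AAC72193[accn]', 'GenPept'), "\n";
```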

Tool

A string with no internal spaces that identifies the resource that is using Entrez links. This argument is used to help NCBI provide better service to third parties generating Entrez queries from programs. As with any query system, it is sometimes possible to ask the same question different ways, with different effects on performance. NCBI requests that developers sending batch requests include a constant ‘tool’ argument for all requests using the utilities.

Example: tool=resource

Email Address

If you choose to provide an email address we will use it to contact you if there are problems with your queries or if we are changing software interfaces that might specifically affect your requests. If you choose not to include an email address you can sign up for utilities-announce to receive general announcements.

Example: email=name@institution.org
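
A script that generates many links could append the tool and email arguments to each one. The sketch below is one way to do that; the helper name add_identity is hypothetical, and the values are the placeholder examples from this page.

```perl
use strict;
use warnings;

# Hypothetical helper: append the tool and email arguments to a link.
# The tool string must not contain internal spaces.
sub add_identity {
    my ($url, $tool, $email) = @_;
    die "tool must not contain spaces\n" if $tool =~ /\s/;
    my $sep = $url =~ /\?/ ? '&' : '?';   # start or extend the query string
    return $url . $sep . "tool=$tool&email=$email";
}

print add_identity(
    'http://www.ncbi.nlm.nih.gov/pubmed?term=antioxidant+chocolate',
    'resource', 'name@institution.org'), "\n";
```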

Display Formats

Display Formats for Sample Entrez Databases

Note: Scripts/programs that import XML should use E-Utilities.

Database     Display Formats
PubMed       DocSum, Abstract, MEDLINE, XML
Nucleotide   DocSum, Brief, GenBank, ASN1, FASTA, ExternalLink, XML, Graph, fasta_xml, gbc_xml
OMIM         Detailed, Synopsis, Variants, ASN1, XML, ExternalLink
Gene         DocSum, Full_Report, ASN1, XML, Gene_Table, ExternalLink
Protein      DocSum, Brief, GenPept, ASN1, FASTA, ExternalLink, XML, graph, fasta_xml, gpc_xml
Genome       DocSum, Brief, ASN1, ExternalLink, XML, Protein Table, cDNA FASTA, Protein FASTA, Structural RNA Table, Contig Table
Structure    DocSum, Brief
PopSet       DocSum, Brief, ASN1, ExternalLink
Taxonomy     DocSum, Brief, TxUidList, TxInfo, TxTree, ExternalLink, XML


How to use Perl LWP to download content and extract links

Use this Perl Script

#!/usr/bin/perl
use strict;
use warnings;

# load the LWP library and the HTML parser:
use LWP::UserAgent;
use HTTP::Request;
use HTML::Parse;

# define a URL
my $url = 'https://www.cnn.com/';

# create UserAgent object
my $ua = LWP::UserAgent->new;

# set a user agent (browser-id)
# $ua->agent('Mozilla/5.5 (compatible; MSIE 8; Windows NT 5.1)');

# timeout (in seconds):
$ua->timeout(15);

# perform the request:
my $request = HTTP::Request->new(GET => $url);

my $response = $ua->request($request);

#
# responses:
#

# response code (like 200, 404, etc)
my $code = $response->code;

# headers (Server: Apache, Content-Type: text/html, ...)
my $headers = $response->headers_as_string;

# HTML body:
my $body = $response->content;

# print the website content:
# print $body;

# do some parsing:

my $parsed_html = HTML::Parse::parse_html($body);
for (@{ $parsed_html->extract_links(qw(a body img)) }) {

    # each entry starts with the link URL found in an a, body or img tag
    my ($link) = @$_;

    print $link . "\n";
}
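
The script above parses whatever comes back, even when the request failed. A minimal sketch of a check you could add before parsing is shown below. The helper name body_or_die is hypothetical, and the two responses are built by hand so the example runs without network access; in the real script you would pass in the result of $ua->request($request).

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: return the decoded body of a successful response,
# or die with the status line (e.g. "404 Not Found") otherwise.
sub body_or_die {
    my ($response) = @_;
    die 'Request failed: ' . $response->status_line . "\n"
        unless $response->is_success;
    return $response->decoded_content;
}

# Hand-built responses stand in for $ua->request(...) in this sketch.
my $ok  = HTTP::Response->new(200, 'OK',
              ['Content-Type' => 'text/html'], '<a href="/x">x</a>');
my $bad = HTTP::Response->new(404, 'Not Found');

print body_or_die($ok), "\n";
eval { body_or_die($bad) };
print "Error: $@" if $@;
```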
