A few notes regarding Yahoo session: YQL - very nice API, successor of the Yahoo pipes. With you could read and parse user social data, parse RSS or Twiter feeds. Feature that I most liked is possibility to parse recent multiple RSS feeds and extract information you need with a simple query like this:
select title from rss where url="http://rss.news.yahoo.com/rss/topstories" and title like '%GM%' Currently it's limited up to 10000 requests a day from a single IP address.
OAth - yahoo platform for authentication, standard that all big web guys( Facebook, Google, etc. ) are going to adopt. Very nice feature for web sites that don't want to 'outsource' authentication.
The SMTP server port to connect to, if the connect() method doesn't explicitly specify one. Defaults to 25.
mail.smtp.connectiontimeout
int
Socket connection timeout value in milliseconds. Default is infinite timeout.
mail.smtp.timeout
int
Socket I/O timeout value in milliseconds. Default is infinite timeout.
mail.smtp.from
String
Email address to use for SMTP MAIL command. This sets the envelope return address. Defaults to msg.getFrom() or InternetAddress.getLocalAddress(). NOTE: mail.smtp.user was previously used for this.
mail.smtp.localhost
String
Local host name used in the SMTP HELO or EHLO command. Defaults to InetAddress.getLocalHost().getHostName(). Should not normally need to be set if your JDK and your name service are configured properly.
mail.smtp.localaddress
String
Local address (host name) to bind to when creating the SMTP socket. Defaults to the address picked by the Socket class. Should not normally need to be set, but useful with multi-homed hosts where it's important to pick a particular local address to bind to.
mail.smtp.localport
int
Local port number to bind to when creating the SMTP socket. Defaults to the port number picked by the Socket class.
mail.smtp.ehlo
boolean
If false, do not attempt to sign on with the EHLO command. Defaults to true. Normally failure of the EHLO command will fallback to the HELO command; this property exists only for servers that don't fail EHLO properly or don't implement EHLO properly.
mail.smtp.auth
boolean
If true, attempt to authenticate the user using the AUTH command. Defaults to false.
mail.smtp.auth.mechanisms
String
If set, lists the authentication mechanisms to consider, and the order in which to consider them. Only mechanisms supported by the server and supported by the current implementation will be used. The default is "LOGIN PLAIN DIGEST-MD5", which includes all the authentication mechanisms supported by the current implementation.
mail.smtp.submitter
String
The submitter to use in the AUTH tag in the MAIL FROM command. Typically used by a mail relay to pass along information about the original submitter of the message. See also the setSubmitter method of SMTPMessage. Mail clients typically do not use this.
mail.smtp.dsn.notify
String
The NOTIFY option to the RCPT command. Either NEVER, or some combination of SUCCESS, FAILURE, and DELAY (separated by commas).
mail.smtp.dsn.ret
String
The RET option to the MAIL command. Either FULL or HDRS.
mail.smtp.allow8bitmime
boolean
If set to true, and the server supports the 8BITMIME extension, text parts of messages that use the "quoted-printable" or "base64" encodings are converted to use "8bit" encoding if they follow the RFC2045 rules for 8bit text.
mail.smtp.sendpartial
boolean
If set to true, and a message has some valid and some invalid addresses, send the message anyway, reporting the partial failure with a SendFailedException. If set to false (the default), the message is not sent to any of the recipients if there is an invalid recipient address.
mail.smtp.sasl.realm
String
The realm to use with DIGEST-MD5 authentication.
mail.smtp.quitwait
boolean
If set to false, the QUIT command is sent and the connection is immediately closed. If set to true (the default), causes the transport to wait for the response to the QUIT command.
mail.smtp.reportsuccess
boolean
If set to true, causes the transport to include an SMTPAddressSucceededException for each address that is successful. Note also that this will cause a SendFailedException to be thrown from the sendMessage method of SMTPTransport even if all addresses were correct and the message was sent successfully.
mail.smtp.socketFactory
SocketFactory
If set to a class that implements the javax.net.SocketFactory interface, this class will be used to create SMTP sockets. Note that this is an instance of a class, not a name, and must be set using the put method, not the setProperty method.
mail.smtp.socketFactory.class
String
If set, specifies the name of a class that implements the javax.net.SocketFactory interface. This class will be used to create SMTP sockets.
mail.smtp.socketFactory.fallback
boolean
If set to true, failure to create a socket using the specified socket factory class will cause the socket to be created using the java.net.Socket class. Defaults to true.
mail.smtp.socketFactory.port
int
Specifies the port to connect to when using the specified socket factory. If not set, the default port will be used.
mail.smtp.ssl.enable
boolean
If set to true, use SSL to connect and use the SSL port by default. Defaults to false for the "smtp" protocol and true for the "smtps" protocol.
mail.smtp.ssl.checkserveridentity
boolean
If set to true, check the server identity as specified by RFC 2595. These additional checks based on the content of the server's certificate are intended to prevent man-in-the-middle attacks. Defaults to false.
mail.smtp.ssl.socketFactory
SSLSocketFactory
If set to a class that extends the javax.net.ssl.SSLSocketFactory class, this class will be used to create SMTP SSL sockets. Note that this is an instance of a class, not a name, and must be set using the put method, not the setProperty method.
mail.smtp.ssl.socketFactory.class
String
If set, specifies the name of a class that extends the javax.net.ssl.SSLSocketFactory class. This class will be used to create SMTP SSL sockets.
mail.smtp.ssl.socketFactory.port
int
Specifies the port to connect to when using the specified socket factory. If not set, the default port will be used.
mail.smtp.ssl.protocols
string
Specifies the SSL protocols that will be enabled for SSL connections. The property value is a whitespace separated list of tokens acceptable to the javax.net.ssl.SSLSocket.setEnabledProtocols method.
mail.smtp.ssl.ciphersuites
string
Specifies the SSL cipher suites that will be enabled for SSL connections. The property value is a whitespace separated list of tokens acceptable to the javax.net.ssl.SSLSocket.setEnabledCipherSuites method.
mail.smtp.mailextension
String
Extension string to append to the MAIL command. The extension string can be used to specify standard SMTP service extensions as well as vendor-specific extensions. Typically the application should use the SMTPTransport method supportsExtension to verify that the server supports the desired service extension. See RFC 1869 and other RFCs that define specific extensions.
mail.smtp.starttls.enable
boolean
If true, enables the use of the STARTTLS command (if supported by the server) to switch the connection to a TLS-protected connection before issuing any login commands. Note that an appropriate trust store must configured so that the client will trust the server's certificate. Defaults to false.
mail.smtp.starttls.required
boolean
If true, requires the use of the STARTTLS command. If the server doesn't support the STARTTLS command, or the command fails, the connect method will fail. Defaults to false.
mail.smtp.userset
boolean
If set to true, use the RSET command instead of the NOOP command in the isConnected method. In some cases sendmail will respond slowly after many NOOP commands; use of RSET avoids this sendmail issue. Defaults to false.
Most home networking gear, including routers, has safely transitioned to gigabit ethernet.
The generation, storage, and transmission of large high definition video files is becoming commonplace.
If that sounds like you, or someone you know, there's one tweak you should know about that can potentally improve your local network throughput quite a bit -- enabling Jumbo Frames.
The typical UDP packet looks something like this:
But the default size of that data payload was established years ago. In the context of gigabit ethernet and the amount of data we transfer today, it does seem a bit.. anemic.
The original 1,518-byte MTU for Ethernet was chosen because of the high error rates and low speed of communications. If a corrupted packet is sent, only 1,518 bytes must be re-sent to correct the error. However, each frame requires that the network hardware and software process it. If the frame size is increased, the same amount of data can be transferred with less effort. This reduces CPU utilization (mostly due to interrupt reduction) and increases throughput by allowing the system to concentrate on the data in the frames, instead of the frames around the data.
I use my beloved energy efficient home theater PC as an always-on media server, and I'm constantly transferring gigabytes of video, music, and photos to it. Let's try enabling jumbo frames for my little network.
The first thing you'll need to do is update your network hardware drivers to the latest versions. I learned this the hard way, but if you want to play with advanced networking features like Jumbo Frames, you need the latest and greatest network hardware drivers. What was included with the OS is unlikely to cut it. Check on the network chipset manufacturer's website.
Once you've got those drivers up to date, look for the Jumbo Frames setting in the advanced properties of the network card. Here's what it looks like on two different ethernet chipsets:
That's my computer, and the HTPC, respectively. I was a little disturbed to notice that neither driver recognizes exactly the same data payload size. It's named "Jumbo Frame" with 2KB - 9KB settings in 1KB increments on the Realtek, and "Jumbo Packet" with 4088 or 9014 settings on the Marvell. I know that technically, for jumbo frames to work, all the networking devices on the subnet have to agree on the data payload size. I couldn't tell quite what to do, so I set them as you see above.
(I didn't change anything on my router / switch, which at the moment is the D-Link DGL-4500; note that most gigabit switches support jumbo frames, but you should always verify with the manufacturer's website to be sure.)
I then ran a few tests to see if there was any difference. I started with a simple file copy.
Default network settings
Jumbo Frames enabled
My file copy went from 47.6 MB/sec to 60.0 MB/sec. Not too shabby! But this is a very ad hoc sort of testing. Let's see what the PassMark Network Benchmark has to say.
Default network settings
Jumbo Frames enabled
This confirms what I saw with the file copy. With jumbo frames enabled, we go from 390,638 kilobits/sec to 477,927 kilobits/sec average. A solid 20% improvement.
Now, jumbo frames aren't a silver bullet. There's a reason jumbo frames are never enabled by default: some networking equipment can't deal with the non-standard frame sizes. Like all deviations from default settings, it is absolutely possible to make your networking worse by enabling jumbo frames, so proceed with caution. This SmallNetBuilder article outlines some of the pitfalls:
1) For a large frame to be transmitted intact from end to end, every component on the path must support that frame size.
The switch(es), router(s), and NIC(s) from one end to the other must all support the same size of jumbo frame transmission for a successful jumbo frame communication session.
2) Switches that don't support jumbo frames will drop jumbo frames.
In the event that both ends agree to jumbo frame transmission, there still needs to be end-to-end support for jumbo frames, meaning all the switches and routers must be jumbo frame enabled. At Layer 2, not all gigabit switches support jumbo frames. Those that do will forward the jumbo frames. Those that don't will drop the frames.
3) For a jumbo packet to pass through a router, both the ingress and egress interfaces must support the larger packet size. Otherwise, the packets will be dropped or fragmented.
If the size of the data payload can't be negotiated (this is known as PMTUD, packet MTU discovery) due to firewalls, the data will be dropped with no warning, or "blackholed". And if the MTU isn't supported, the data will have to be fragmented to a supported size and retransmitted, reducing throughput.
In addition to these issues, large packets can also hurt latency for gaming and voice-over-IP applications. Bigger isn't always better.
Still, if you regularly transfer large files, jumbo frames are definitely worth looking into. My tests showed a solid 20% gain in throughput, and for the type of activity on my little network, I can't think of any downside.
Datastore is transactional: writes require disk access
Disk access means disk seeks
Rule of thumb: 10ms for a disk seek
Simple math: 1s / 10ms = 100 seeks/sec maximum
Depends on: * The size and shape of your data * Doing work in batches (batch puts and gets)
Reads are cheap!
Reads do not need to be transactional, just consistent
Data is read from disk once, then it's easily cached
All subsequent reads come straight from memory
Rule of thumb: 250usec for 1MB of data from memory
Simple math: 1s / 250usec = 4GB/sec maximum * For a 1MB entity, that's 4000 fetches/sec
Numbers Miscellaneous
This group of numbers is from a presentation Jeff Dean gave at a Engineering All-Hands Meeting at Google.
L1 cache reference 0.5 ns
Branch mispredict 5 ns
L2 cache reference 7 ns
Mutex lock/unlock 100 ns
Main memory reference 100 ns
Compress 1K bytes with Zippy 10,000 ns
Send 2K bytes over 1 Gbps network 20,000 ns
Read 1 MB sequentially from memory 250,000 ns
Round trip within same datacenter 500,000 ns
Disk seek 10,000,000 ns
Read 1 MB sequentially from network 10,000,000 ns
Read 1 MB sequentially from disk 30,000,000 ns
Send packet CA->Netherlands->CA 150,000,000 ns
The Lessons
Writes are 40 times more expensive than reads.
Global shared data is expensive. This is a fundamental limitation of distributed systems. The lock contention in shared heavily written objects kills performance as transactions become serialized and slow.
Architect for scaling writes.
Optimize for low write contention.
Optimize wide. Make writes as parallel as you can.