DEV Community

BC

Day 35: Parse URL - 100DayOfRust

We can use the url crate to parse a URL:

Cargo.toml:

[dependencies]
url = "2.1.1"

Example:

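use url::Url;

fn main() {
    let url = "https://github.com:8443/rust/issues?labels=E-easy&state=open";
    // Url::parse returns a Result; unwrap is acceptable for a hard-coded, valid URL.
    let parsed = Url::parse(url).unwrap();
    println!("scheme: {}", parsed.scheme());
    println!("host: {}", parsed.host().unwrap());
    // port() returns an Option<u16>, so only print it when it is present.
    if let Some(port) = parsed.port() {
        println!("port: {}", port);
    }
    println!("path: {}", parsed.path());
    // query() returns None when the URL has no query string; this URL has one.
    println!("query: {}", parsed.query().unwrap());
}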

Run the code:

scheme: https
host: github.com
port: 8443
path: /rust/issues
query: labels=E-easy&state=open

One thing I noticed, and at first I was not sure whether it is a bug in the url crate: if I change the URL to:

https://github.com:443/rust/issues?labels=E-easy&state=open

The port method returns None instead of 443. It seems the library drops the port when it is the default port for the URL's scheme.

This differs from Python's behavior:

from urllib.parse import urlparse
parsed = urlparse("https://github.com:443/rust/issues?labels=E-easy&state=open")
print(parsed.port)
# prints 443

In this case, if we still want to get 443, we should use port_or_known_default instead of the port method:

    if let Some(port) = parsed.port_or_known_default() {
        println!("port: {}", port);
    }

The caveat is that with this method, even if the URL does not contain ":443", it will still print 443.
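
To make the two behaviors easier to compare, here is a rough std-only model of what I observed. It is not the url crate's actual code: the function names default_port, stored_port and port_or_default are made up for illustration, and the list of default ports is an assumed subset.

```rust
// A std-only model of the observed behavior; NOT the url crate's implementation.
// All function names here are hypothetical.

/// Default ports for a few well-known schemes (assumed subset).
fn default_port(scheme: &str) -> Option<u16> {
    match scheme {
        "http" | "ws" => Some(80),
        "https" | "wss" => Some(443),
        "ftp" => Some(21),
        _ => None,
    }
}

/// Models port(): an explicit port equal to the scheme default is dropped.
fn stored_port(scheme: &str, explicit: Option<u16>) -> Option<u16> {
    match explicit {
        Some(p) if Some(p) == default_port(scheme) => None,
        other => other,
    }
}

/// Models port_or_known_default(): fall back to the scheme default.
fn port_or_default(scheme: &str, explicit: Option<u16>) -> Option<u16> {
    stored_port(scheme, explicit).or_else(|| default_port(scheme))
}

fn main() {
    // With ":443" written out and with no port at all, the results are the same.
    println!("{:?}", stored_port("https", Some(443)));
    println!("{:?}", stored_port("https", None));
    println!("{:?}", port_or_default("https", Some(443)));
    println!("{:?}", port_or_default("https", None));
    // A non-default port survives in both cases.
    println!("{:?}", stored_port("https", Some(8443)));
}
```

In this model, neither function can tell whether ":443" was written in the URL or only implied by the scheme, which is exactly the caveat above.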

[Update] Apparently this port behavior is by design, according to this comment in the source code:

/// Note that default port numbers are never reflected by the serialization,
/// use the port_or_known_default() method if you want a default port number returned.

I would still prefer that, if the URL contains an explicit port number, the port method returned it even when it is the standard port for the scheme. Returning None is just weird.
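
If the explicit port really matters, one possible workaround is to look at the raw string before it is normalized. The sketch below uses only the standard library; explicit_port is a hypothetical helper of mine, not part of the url crate, and it assumes an absolute URL of the form scheme://authority/... without trying to cover every corner of the URL specification.

```rust
/// Returns the port written explicitly in `raw`, if any.
/// Hypothetical helper; assumes an absolute "scheme://authority/..." URL.
fn explicit_port(raw: &str) -> Option<u16> {
    // Skip past "scheme://".
    let rest = raw.split_once("://")?.1;
    // The authority ends at the first '/', '?' or '#'.
    let end = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    let authority = &rest[..end];
    // Drop an optional "user:password@" prefix.
    let host_port = authority.rsplit_once('@').map_or(authority, |(_, hp)| hp);
    // Split at the last ':'; no colon means no explicit port.
    let (host, port) = host_port.rsplit_once(':')?;
    // For a bracketed IPv6 host, a colon inside the brackets is not a port separator.
    if host.contains('[') && !host.ends_with(']') {
        return None;
    }
    port.parse().ok()
}

fn main() {
    let raw = "https://github.com:443/rust/issues?labels=E-easy&state=open";
    match explicit_port(raw) {
        Some(port) => println!("explicit port: {}", port),
        None => println!("no explicit port"),
    }
}
```

The idea would be to call this on the same string that is passed to Url::parse, and keep using the parsed Url for everything else.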
