Monday, June 29, 2015

Lync/Skype4B Mobility dissected





Ah, Lync Mobility... My favorite topic.
According to a leading research center, the #1 cause of ulcers among Lync administrators is deploying Mobility.
Just kidding... about the research. The rest is true - just take a look at the TechNet forums.
Today I decided to take another look at the Mobility subject and attempt to clarify some of the major misconceptions surrounding this particular Lync/Skype4B server modality.


Core principles



Like any other SIP endpoint, mobile apps use SIP signaling to sign in, send and receive IMs, and/or negotiate voice/video calls. However, unlike the rest of the family, this SIP signaling is encapsulated, for lack of a better word, within SSL (HTTPS) traffic. So, while the so-called “fat client” connects to the SIP server service directly, the mobile client connects to the UCWA virtual web site, which then proxies to the SIP service on behalf of the endpoint.

When it comes to media, P2P or a meeting, the above still applies (the call setup), but media flows exactly as it would between two "normal" clients. That is, if the mobile client is on internal Wi-Fi and the other endpoint (desktop or mobile) is on the same internal network, the media would flow peer-to-peer. If the mobile endpoint is on the public Internet and the other endpoint is on the internal network, the mobile endpoint device would use the Edge server in the deployment. The same holds true if the mobile endpoint device and the other endpoint are both on the Internet.

The net takeaway so far is:

Mobile device will use SIP encapsulated within HTTPS for signaling

HTTPS traffic will flow through the Reverse Proxy

Media will flow P2P or via the Edge server, depending on the endpoint’s physical location.
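
To make the takeaway concrete, here is a minimal Python sketch of the media-path rule described above. The names (Location, media_path) are made up for illustration and are not part of any Lync API. The case "mobile inside, peer on the Internet" is not spelled out in the text; the sketch assumes it behaves like the mirrored case and goes through the Edge server.

```python
from enum import Enum


class Location(Enum):
    INTERNAL = "internal"  # corporate LAN or corporate Wi-Fi
    INTERNET = "internet"  # public Internet or carrier network


def media_path(mobile: Location, peer: Location) -> str:
    """Return the media path implied by the rules in the text.

    Signaling is not modeled here: it always goes over HTTPS
    through the reverse proxy, regardless of location.
    """
    if mobile is Location.INTERNAL and peer is Location.INTERNAL:
        # Both endpoints share the internal network.
        return "peer-to-peer"
    # At least one endpoint is outside (assumption for the
    # "mobile inside, peer outside" case, see the note above).
    return "edge"


if __name__ == "__main__":
    for m in Location:
        for p in Location:
            print(f"mobile={m.value:8} peer={p.value:8} -> {media_path(m, p)}")
```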


DNS




Beginning with Lync 2010, a new service was introduced, published through the lyncdiscoverinternal and lyncdiscover DNS records. These records have become the preferred method to discover Lync registrar services across all clients. However, there is one very important difference between desktop and mobile clients: while desktop clients have a built-in DNS fallback mechanism, mobile clients work only with the autodiscover service and, if autodiscover is not available or not working correctly, mobile sign-in will fail.

This autodiscover service is provided by the Autodiscover virtual web site on the Director or Front End pool and is present in both the Internal and External web sites. For this reason, it is required to use "FQDN override" in the Topology so that the Internal and External web sites have different FQDNs, as shown below. For example, this is my Enterprise pool:





Note that while the pool FQDN is pool1.skypeuc.com, the Internal web site FQDN is webaint.skypeuc.com and the External web site FQDN is webaext.skypeuc.com.


If you have a Standard edition pool, the Internal web site FQDN cannot be changed, but the external still can and must be changed:





As discussed in the previous article, the infrastructure will respond according to which web site the client queries.

Service discovery process 


Clients will first query DNS for lyncdiscoverinternal.contoso.com.
  • If the record is present, (all) clients will attempt to connect first using HTTP (non-encrypted). If the connection is successful (i.e. the target listens on port 80), the web site will respond with a redirect to https://lyncdiscoverinternal.contoso.com where the client receives XML containing the web URL where the client should go to authenticate and receive the web ticket.



  • If the client does not get a response from the HTTP call, it will attempt HTTPS directly. If both (HTTP and HTTPS) connection attempts fail, it gets interesting:

The desktop client will fall back and attempt to use SRV records (_sipinternaltls._tcp.contoso.com, etc.). If SRV records are not present, the client will attempt to resolve the host (A) record for sip.contoso.com. If this fails as well, the desktop client will not be able to sign in.

The mobile client does not have a fallback mechanism. If the Lync autodiscover service is not available (either because of DNS resolution or unavailability of the service), the mobile client will fail to sign in!
  • If lyncdiscoverinternal is not resolvable, clients will try to resolve lyncdiscover. The above still applies.
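
The discovery order above can be sketched as a flat list of attempts. This is a simplified illustration, not the clients' actual code: it makes no network calls, the function name discovery_attempts is invented, and flattening the branches into one ordered list is an assumption made for readability.

```python
from typing import List


def discovery_attempts(sip_domain: str, client: str) -> List[str]:
    """List the lookups a client would try, in order, per the text.

    client is "desktop" or "mobile" (labels chosen for this sketch).
    """
    if client not in ("desktop", "mobile"):
        raise ValueError("client must be 'desktop' or 'mobile'")

    attempts = []
    # Autodiscover records: internal first, then external.
    for record in ("lyncdiscoverinternal", "lyncdiscover"):
        host = f"{record}.{sip_domain}"
        attempts.append(f"http://{host}")   # tried first, no certificate needed
        attempts.append(f"https://{host}")  # tried if HTTP gets no response

    if client == "desktop":
        # Only the desktop client falls back to legacy DNS records.
        attempts.append(f"SRV _sipinternaltls._tcp.{sip_domain}")
        attempts.append(f"A sip.{sip_domain}")
    # The mobile client has nothing left to try: sign-in fails.
    return attempts


if __name__ == "__main__":
    for kind in ("desktop", "mobile"):
        print(kind)
        for step in discovery_attempts("contoso.com", kind):
            print("  ", step)
```

The difference between the two lists is the whole point: the mobile list ends with the lyncdiscover attempts, so a broken autodiscover record leaves it with no way in.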

Mobile Device sign-in flow


The following conditions apply for this example:

  • Two SIP domains are supported - skypeuc.com (primary) and lynclog.com (additional)
  • Simple URL and autodiscover services are pointed to (and served by) Pool1.
  • The internal VIP terminates the SSL session with a wildcard certificate with SN=*.skypeuc.com and SAN=*.skypeuc.com
  • The account we use to sign in has a SIP URI @lynclog.com
  • The account is homed on Pool2
  • The device is on corporate Wi-Fi and it is BYOD (not managed)
  • Besides the skypeuc.com DNS zone, the administrator maintains a pinpoint DNS zone for lynclog.com as well
  • The enterprise does not allow hairpinning. Instead, internal VIPs were created to act as Reverse Proxy for client requests from corporate Wi-Fi to the external pool web services. The VIPs use a *.skypeuc.com (wildcard) certificate.
  • The FQDNs of the external web sites (served by the "internal" Reverse Proxy) are resolvable by the internal DNS

Mobile client queries DNS and resolves lyncdiscoverinternal.lynclog.com to an internal IP address. The IP is a VIP of the hardware load balancer serving Pool1.

By design, the first attempt the endpoint makes is http://lyncdiscoverinternal.lynclog.com. Because the call is HTTP, no certificate trust is required, the connection succeeds and the autodiscover service returns JSON (JavaScript Object Notation) with a redirect to the authentication URL.


***Note that now HTTPS is required and the URL is webaint.skypeuc.com (the internal VIP of Pool1). The endpoint follows the instructions, SSL connection is now terminated with *.skypeuc.com certificate and trust is established.

Endpoint requests a WebTicket and is challenged with the NTLM authentication mechanism.



After successful authentication, endpoint receives XML with service location.
 


***XML points to the user's home pool resources.

Endpoint goes to Pool2 external web service FQDN and presents WebTicket.




Because this is the first time this endpoint signs in with this account, the endpoint also requests a certificate (since it is internal, it will do so via the internal web service FQDN)



Receiving the certificate requires new authentication



Certificate is received



Hallelujah, we have signed-in


***The device then connects to Exchange, but this is out of the current scope.

The device receives the Mobile Policy via in-band provisioning



...instruction set for allowed modalities...



...MRAS credentials (because media will flow via the Edge server)...



...and, at the end, presence information of user's contacts.

The process of sign-in from public Internet is very similar, just (because we are coming from internet) no calls to internal resources are made.
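
One detail of the walkthrough deserves a closer look: the *.skypeuc.com wildcard certificate cannot cover lyncdiscoverinternal.lynclog.com, which is why the initial HTTP call followed by a redirect to webaint.skypeuc.com matters. The Python sketch below illustrates the usual left-most-label wildcard rule. It is a simplified illustration, not a full certificate validator, and the function name wildcard_matches is made up for this example.

```python
def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Check a host name against a certificate name.

    Simplified rule: "*" may stand for exactly one left-most label,
    so "*.skypeuc.com" covers "webaint.skypeuc.com" but neither
    "skypeuc.com" nor "a.b.skypeuc.com".
    """
    pattern = pattern.lower()
    hostname = hostname.lower()
    if not pattern.startswith("*."):
        # No wildcard: the names must match exactly.
        return pattern == hostname
    suffix = pattern[1:]  # e.g. ".skypeuc.com"
    if not hostname.endswith(suffix):
        return False
    left_label = hostname[: -len(suffix)]
    # The wildcard must cover one non-empty label with no dots.
    return left_label != "" and "." not in left_label


if __name__ == "__main__":
    cert_name = "*.skypeuc.com"
    for host in ("webaint.skypeuc.com",
                 "webaext.skypeuc.com",
                 "lyncdiscoverinternal.lynclog.com"):
        print(host, "->", wildcard_matches(cert_name, host))
```

With the names from this example, both web site FQDNs match the wildcard, while the autodiscover name in the lynclog.com domain does not.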


Hairpinning


Hairpinning is a method for hosts on the LAN to leave the perimeter via NAT (as they do to reach resources on the Internet) and make a U-turn to access enterprise resources exposed to the Internet.

For example:

Internal devices are on the 10.255.3.0 subnet
All devices go to the Internet via the NATed public IP address 71.14.14.42
The Reverse Proxy VIP for Pool1 has the public IP 71.14.14.46

To perform "hairpinning", a device with LAN IP 10.255.3.100 would leave the router with the NATed IP 71.14.14.42, make a U-turn (without actually leaving the infrastructure, i.e. going to the Internet) and visit 71.14.14.46. The response from the Reverse Proxy would get back to the device along the exact same path in the opposite direction.

We already established that mobile devices always use the external pool web for signaling. For this reason, device on corporate Wi-Fi must be able to resolve the external web site FQDN in the internal DNS. The key word is "resolve", not "resolve to public IP address"...

The example above shows a scenario where hairpinning is not allowed in the enterprise. Instead, a VIP was created to act as Reverse Proxy for clients on corporate Wi-Fi. The (internal, LAN) IP address of this VIP was entered in DNS as the external web site IP address.

If hairpinning were allowed, the A record (in LAN DNS) for webaext.skypeuc.com would have the IP address 71.14.14.46 (the public IP address of the Reverse Proxy).

There is a third method, one that bypasses the creation of an "internal" reverse proxy VIP. A typical reverse proxy is "two-legged": one interface on the LAN subnet (talking to the servers) and a DMZ interface NATed to a public IP. The FQDN of the public web site could be entered in internal DNS with the reverse proxy's DMZ IP address and, of course, the firewall configured accordingly. In this case, we "hairpin" to... the DMZ.

For example:

DMZ IP of Reverse proxy - 192.168.1.46 (NATed to 71.14.14.46)
In LAN DNS - Public web site FQDN resolves to 192.168.1.46
In Public DNS - Public web site FQDN resolves to IP 71.14.14.46

In both cases, traffic flows via the reverse proxy and "lands" on the external web site.

In all three cases, the mobile device will do signaling via the external web site.

Hairpinning is the preferred method because it avoids potential issues with caching the IP address of the external web site, and the transition between corporate Wi-Fi and the carrier network will be faster.
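
The caching argument is easier to see when the three options are laid out side by side. The Python sketch below is only an illustration built from the example addresses in this section. The internal VIP address 10.255.3.50 is hypothetical (no address is given for it here), and the option labels and function names are invented for the sketch.

```python
# What the internal (LAN) DNS returns for the external web site FQDN
# under each of the three options discussed in this section.
INTERNAL_DNS_ANSWER = {
    "hairpin": "71.14.14.46",       # public IP of the Reverse Proxy
    "internal_vip": "10.255.3.50",  # hypothetical internal VIP address
    "dmz": "192.168.1.46",          # DMZ IP of the Reverse Proxy
}

# What public DNS returns, whichever option is used internally.
PUBLIC_DNS_ANSWER = "71.14.14.46"


def resolve_external_web(on_corporate_wifi: bool, option: str) -> str:
    """Return the IP a device gets for the external web site FQDN."""
    if on_corporate_wifi:
        return INTERNAL_DNS_ANSWER[option]
    return PUBLIC_DNS_ANSWER


def ip_changes_on_roam(option: str) -> bool:
    """True if the address differs between Wi-Fi and carrier network.

    A changed address means a cached entry on the device is stale
    right after the device moves between networks.
    """
    return resolve_external_web(True, option) != resolve_external_web(False, option)


if __name__ == "__main__":
    for option in INTERNAL_DNS_ANSWER:
        print(f"{option:13} changes on roam: {ip_changes_on_roam(option)}")
```

Only with hairpinning does the device see the same address on both networks, which is the reason given above for preferring it.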

lyncdiscoverinternal vs lyncdiscover


When hairpinning is allowed, the administrator might elect to use the lyncdiscover record in internal DNS. As we see in the Fiddler trace, the XML response with service locations contains pointers to both internal and external means of connectivity. The client will select the one that applies to it.

I have yet to see a good explanation of when to use one or the other.


Last words


In case you wonder where the traces come from - I used Fiddler as described in this article.

The next article will be about KEMP LoadMaster (qualified Reverse Proxy).